The Technological Developments of the Dutch Folktale Database (1994–2016)
In 1994, the Dutch Folktale Database started as a stand-alone database, and it went online in 2004. Since 2016, after two major projects, all kinds of metadata can be added automatically and in a semi-supervised way: languages, names, keywords, summaries, subgenres, motifs and tale types. To this end, the database adopted a new platform called Omeka, which suits the needs of many databases in the humanities and can handle all kinds of plug-ins. The following techniques have been used: n-grams, language detection, named entity recognition, keyword extraction, summarization, bag of words, machine learning and natural language processing. In addition, a search engine for motifs, MOMFER, has been added. Data interpretation is facilitated by new means of visualization: geographical maps, timelines, a network of similar tales and word clouds. Because the database meets the Dublin Core requirements, it can be connected to similar databases or to a data harvester. Recently, a transatlantic data-mining application has been set up to build a harvester called ISEBEL: Intelligent Search Engine for Belief Legends. The harvester should be able to search a Dutch, a Danish and a German database simultaneously. Other databases can be added later.
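One of the techniques listed above, language detection, is commonly implemented with character n-gram profiles. The sketch below is a minimal illustration of that idea, not the database's actual implementation; the tiny training snippets and all names are hypothetical, and a real system would train on far larger corpora per language.

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Return a frequency profile of character n-grams for a text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

# Toy per-language profiles built from a handful of common function words.
PROFILES = {
    "nl": char_ngrams("de het een van in is dat op te zijn met voor niet"),
    "en": char_ngrams("the of and to in is that it was for on with as at"),
}

def detect_language(text):
    """Pick the language whose n-gram profile overlaps most with the text."""
    profile = char_ngrams(text)
    def overlap(lang):
        return sum(min(c, PROFILES[lang][g]) for g, c in profile.items())
    return max(PROFILES, key=overlap)
```

With profiles this small the classifier only works on short function-word-heavy phrases, but the same overlap scoring scales directly to profiles trained on full corpora.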
Supporting the Exploration of Online Cultural Heritage Collections: The Case of the Dutch Folktale Database
This paper demonstrates the use of a user-centred design approach for the development of generous interfaces/rich prospect browsers for an online cultural heritage collection, determining its primary user groups and designing different browsing tools to cater to their specific needs. We set out to solve a set of problems faced by many online cultural heritage collections. These problems are lack of accessibility, limited functionalities to explore the collection through browsing, and risk of less known content being overlooked. The object of our study is the Dutch Folktale Database, an online collection of tens of thousands of folktales from the Netherlands. Although this collection was designed as a research commodity for folktale experts, its primary user group consists of casual users from the general public. We present the new interfaces we developed to facilitate browsing and exploration of the collection by both folktale experts and casual users. We focus on the user-centred design approach we adopted to develop interfaces that would fit the users' needs and preferences
Quality Control in Crowdsourcing: A Survey of Quality Attributes, Assessment Techniques and Assurance Actions
Crowdsourcing enables one to leverage the intelligence and wisdom of
potentially large groups of individuals toward solving problems. Common
problems approached with crowdsourcing are labeling images, translating or
transcribing text, providing opinions or ideas, and similar - all tasks that
computers are not good at or where they may even fail altogether. The
introduction of humans into computations and/or everyday work, however, also
poses critical, novel challenges in terms of quality control, as the crowd is
typically composed of people with unknown and very diverse abilities, skills,
interests, personal objectives and technological resources. This survey studies
quality in the context of crowdsourcing along several dimensions, so as to
define and characterize it and to understand the current state of the art.
Specifically, this survey derives a quality model for crowdsourcing tasks,
identifies the methods and techniques that can be used to assess the attributes
of the model, and the actions and strategies that help prevent and mitigate
quality problems. An analysis of how these features are supported by the state
of the art further identifies open issues and informs an outlook on hot future
research directions.
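A common assurance action from the quality-control literature this survey covers is redundant labeling with majority-vote aggregation, resubmitting low-agreement items for more judgments. The sketch below illustrates that pattern; the function names, threshold value, and sample data are illustrative assumptions, not taken from the survey.

```python
from collections import Counter

def majority_vote(judgments):
    """Aggregate redundant worker labels for one item by majority vote.

    Returns the winning label and the fraction of workers who agreed.
    """
    label, votes = Counter(judgments).most_common(1)[0]
    return label, votes / len(judgments)

def filter_low_agreement(items, threshold=0.6):
    """Accept items whose winning label reaches the agreement threshold;
    collect the rest for republication to gather more judgments."""
    accepted, resubmit = {}, []
    for item_id, judgments in items.items():
        label, agreement = majority_vote(judgments)
        if agreement >= threshold:
            accepted[item_id] = label
        else:
            resubmit.append(item_id)
    return accepted, resubmit
```

The agreement threshold trades cost against quality: a higher threshold catches more ambiguous or adversarially labeled items but sends more work back to the crowd.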
Using Crowdsourcing to Investigate Perception of Narrative Similarity
For many applications measuring the similarity between documents is essential. However, little is known about how users perceive similarity between documents. This paper presents the first large-scale empirical study that investigates perception of narrative similarity using crowdsourcing. As a dataset we use a large collection of Dutch folk narratives. We study the perception of narrative similarity by both experts and non-experts by analyzing their similarity ratings and motivations for these ratings. While experts focus mostly on the plot, characters and themes of narratives, non-experts also pay attention to dimensions such as genre and style. Our results show that a more nuanced view is needed of narrative similarity than captured by story types, a concept used by scholars to group similar folk narratives. We also evaluate to what extent unsupervised and supervised models correspond with how humans perceive narrative similarity
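The unsupervised models evaluated against human similarity judgments are typically vector-space baselines. As a minimal illustration of one such baseline, the sketch below computes TF-IDF weights and cosine similarity over tokenized narratives; this is a generic textbook formulation under my own assumptions, not the specific models used in the paper.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute sparse TF-IDF vectors (term -> weight) for tokenized docs."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0
```

Because TF-IDF only sees word overlap, it captures surface similarity rather than plot, character, or theme, which is exactly the gap between such models and human judgments that studies like this one probe.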