85 research outputs found
When humans and machines collaborate: Cross-lingual Label Editing in Wikidata
The quality and maintainability of a knowledge graph are determined by the process through which it is created. There are different approaches to such processes: extraction or conversion of data available on the web (e.g. the automated extraction of DBpedia from Wikipedia), community-created knowledge graphs, often built by a group of experts, and hybrid approaches in which humans maintain the knowledge graph alongside bots. In this work we focus on the hybrid approach of human-edited knowledge graphs supported by automated tools. In particular, we analyse the editing of natural language data, i.e. labels. Labels are the entry point for humans to understand the information, and therefore need to be carefully maintained. We take a step towards understanding the collaborative editing of humans and automated tools across languages in a knowledge graph. We use Wikidata, as it has a large and active community of humans and bots working together, covering over 300 languages. We analyse the different editor groups and how they interact with data in different languages in order to understand the provenance of the current label data.
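The editor-group analysis described above can be sketched as a simple aggregation over revision records. The record fields, group names, and heuristic below are illustrative assumptions for the sketch, not Wikidata's actual revision schema or the paper's method.

```python
from collections import Counter

def classify_editor(is_bot: bool, is_registered: bool) -> str:
    """Assign a revision's author to an editor group (illustrative heuristic)."""
    if is_bot:
        return "bot"
    if not is_registered:
        return "anonymous"
    return "registered human"

def label_edits_by_group(revisions):
    """Count label edits per (editor group, language) pair."""
    counts = Counter()
    for rev in revisions:
        group = classify_editor(rev["bot"], rev["registered"])
        counts[(group, rev["language"])] += 1
    return counts

# Hypothetical revision records for the sketch.
revisions = [
    {"user": "ExampleBot", "bot": True, "registered": True, "language": "en"},
    {"user": "Alice", "bot": False, "registered": True, "language": "de"},
    {"user": "203.0.113.7", "bot": False, "registered": False, "language": "en"},
]
print(label_edits_by_group(revisions))
```

Grouping by (editor group, language) is what allows questions like "which languages are maintained mostly by bots?" to be answered directly from the counts.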
An investigation of techniques that aim to improve the quality of labels provided by the crowd
The 2013 MediaEval Crowdsourcing task looked at the problem of working with noisy crowdsourced annotations of image data. The aim of the task was to investigate techniques for estimating the true label of an image from a set of noisy crowdsourced labels, and possibly from the content and metadata of the image itself. For the runs in this paper, we applied a shotgun approach and tried a number of existing techniques, including generative probabilistic models and further crowdsourcing.
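The simplest baseline for estimating a true label from noisy crowd labels is majority voting, which the probabilistic models mentioned above refine by also estimating worker reliability. A minimal sketch, with made-up annotation data:

```python
from collections import Counter

def majority_vote(annotations):
    """Estimate each item's true label as its most frequent crowd label.

    `annotations` maps item id -> list of noisy labels from different workers.
    Generative models such as Dawid-Skene go further by weighting each
    worker's votes according to their estimated reliability.
    """
    return {item: Counter(labels).most_common(1)[0][0]
            for item, labels in annotations.items()}

# Hypothetical crowd annotations for two images.
crowd = {
    "img1": ["cat", "cat", "dog"],
    "img2": ["dog", "dog", "dog"],
}
print(majority_vote(crowd))  # {'img1': 'cat', 'img2': 'dog'}
```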
"I Wish You Could Make the Camera Stand Still": Envisioning Media Accessibility Interventions with People with Aphasia
Audiovisual media is integral to modern living, yet it is not always accessible to all. Modern accessibility interventions, such as subtitles, support many users; however, communities with complex communication needs remain largely unconsidered. In this work, we envision future accessibility interventions from the ground up with one such community: people with aphasia. Over two workshops and a probe activity, we problematise the space of audiovisual consumption by people with aphasia and co-envision directions for the development of accessible audiovisual media. From low-fidelity diegetic prototypes to mid-fidelity solutions, we explore new visions of accessibility interventions for complex communication needs, notably enabling high levels of content manipulation and personalisation. Our findings raise open questions and set directions for the research community in developing accessibility interventions for audiovisual media that support users with diverse needs in accessing audiovisual content.
Human Computation and Convergence
Humans are the most effective integrators and producers of information, directly and through the use of information-processing inventions. As these inventions become increasingly sophisticated, the substantive role of humans in processing information will tend toward capabilities that derive from our most complex cognitive processes, e.g., abstraction, creativity, and applied world knowledge. Through the advancement of human computation, methods that leverage the respective strengths of humans and machines in distributed information-processing systems, formerly discrete processes will combine synergistically into increasingly integrated and complex information-processing systems. These new, collective systems will exhibit an unprecedented degree of predictive accuracy in modeling physical and techno-social processes, and may ultimately coalesce into a single unified predictive organism, with the capacity to address society's most wicked problems and achieve planetary homeostasis.
A Model for Language Annotations on the Web
Several annotation models have been proposed to enable a multilingual Semantic Web. Such models home in on the word and its morphology and assume that the language tag and URI come from external resources. These resources, such as ISO 639 and Glottolog, have limited coverage of the world's languages and, at best, a very limited thesaurus-like structure, which hampers language annotation and hence constrains research in Digital Humanities and other fields. To resolve this 'outsourced' task of the current models, we developed a model for representing information about languages, the Model for Language Annotation (MoLA), such that basic language information can be recorded consistently and thereby also queried and analysed. This includes the various types of languages, families, and the relations among them. MoLA is formalised in OWL so that it can integrate with Linguistic Linked Data resources. Sufficient coverage of MoLA is demonstrated with the use case of French.
An architecture for the autonomic curation of crowdsourced knowledge
Human knowledge curators are intrinsically better than their digital counterparts at providing relevant answers to queries. This is mainly because an experienced biological brain will account for relevant community expertise and exploit the underlying connections between pieces of knowledge when offering suggestions pertinent to a specific question, whereas most automated database managers will not. We address this problem by proposing an architecture for the autonomic curation of crowdsourced knowledge, underpinned by semantic technologies. The architecture is instantiated in the career data domain, yielding Aviator, a collaborative platform capable of producing complete, intuitive and relevant answers to career-related queries in a time-effective manner. In addition to providing numeric and use-case-based evidence to support these research claims, this extended work also contains a detailed architectural analysis of Aviator to outline its suitability for automatically curating knowledge to a high standard of quality.