399 research outputs found

    Quality Assessment in Crowdsourced Indigenous Language Transcription

    The digital Bleek and Lloyd Collection is a rare collection that contains artwork, notebooks and dictionaries of the indigenous people of Southern Africa. The notebooks, in particular, contain stories that encode the language, culture and beliefs of these people, handwritten in now-extinct languages with a specialised notation system. Previous attempts have been made to convert the approximately 20000 pages of text to a machine-readable form using machine learning algorithms but, due to the complexity of the text, the recognition accuracy was low. In this paper, a crowdsourcing method is proposed to transcribe the manuscripts, where non-expert volunteers transcribe pages of the notebooks using an online tool. Experiments were conducted to determine the quality and consistency of transcriptions. The results show that volunteers are able to produce reliable transcriptions of high quality. The inter-transcriber agreement is 80% for |Xam text and 95% for English text. When the |Xam text transcriptions produced by the volunteers are compared with a gold standard, the volunteers achieve an average accuracy of 64.75%, which exceeded that in previous work. Finally, the degree of transcription agreement correlates with the degree of transcription accuracy. This suggests that the quality of unseen data can be assessed based on the degree of agreement among transcribers.

    A System for High Quality Crowdsourced Indigenous Language Transcription

    In this article, a crowdsourcing method is proposed to transcribe manuscripts from the Bleek and Lloyd Collection, where non-expert volunteers transcribe pages of the handwritten text using an online tool. The digital Bleek and Lloyd Collection is a rare collection that contains artwork, notebooks and dictionaries of the indigenous people of Southern Africa. The notebooks, in particular, contain stories that encode the language, culture and beliefs of these people, handwritten in now-extinct languages with a specialised notation system. Previous attempts have been made to convert the approximately 20000 pages of text to a machine-readable form using machine learning algorithms but, due to the complexity of the text, the recognition accuracy was low. This article presents details of the system used to enable transcription by volunteers as well as results from experiments that were conducted to determine the quality and consistency of transcriptions. The results show that volunteers are able to produce reliable transcriptions of high quality. The inter-transcriber agreement is 80% for |Xam text and 95% for English text. When the |Xam text transcriptions produced by the volunteers are compared with a gold standard, the volunteers achieve an average accuracy of 64.75%, which exceeded that in previous work. Finally, the degree of transcription agreement correlates with the degree of transcription accuracy. This suggests that the quality of unseen data can be assessed based on the degree of agreement among transcribers.

    Citizen Science in Archaeology

    Citizen science, as a process of volunteer participation through crowdsourcing, facilitates the creation of mass data sets needed to address subtle and large-scale patterns in complex phenomena. Citizen science efforts in other field disciplines such as biology, geography, and astronomy indicate how new web-based interfaces can enhance and expand upon archaeologists' existing platforms of volunteer engagement such as field schools, community archaeology, site stewardship, and professional-avocational partnerships. Archaeological research can benefit from the citizen science paradigm in four ways: fieldwork that makes use of widely available technologies such as mobile applications for photography and data upload; searches of large satellite image collections for site identification and monitoring; crowdfunding; and crowdsourced computer entry of heritage data

    Sourcing Success: Assessment Techniques of Digital Cultural Heritage Crowdsourcing Projects

    This study focuses on how libraries, archives, museums, and other cultural heritage institutions define and assess the success of online crowdsourcing projects. The research was conducted via a survey of twenty-two digital crowdsourcing projects ranging from transcription of digitized archival materials to wildlife documentation projects. The survey found that institutions had diverse reasons for undertaking crowdsourcing projects and monitored project success through multiple assessment measures dependent on project goals. Survey respondents reported greater satisfaction with their project outcomes when they had identified at least one measurable goal prior to starting the project. In general, survey respondents reported positive feelings about, and an interest in, future crowdsourcing projects as tools for description, community engagement, and user recruitment. Master of Science in Library Science

    Xamobile: Usability Evaluation of Text Input Methods on Mobile Devices for Historical African Languages

    Customized text input editors on mobile devices for languages with no standard language models, such as some African languages, are vital to allow text input tasks to be crowdsourced and thus enable quick and precise participation. We investigated four different mobile input techniques for complex language scripts like |Xam and collected accuracy data from experiments with the Xwerty, T9, Pinyin script and hierarchical entry methods for mobile devices, as well as usability data from the participants. Our results on usability testing show that Xwerty methods offer substantial benefits to the majority of users in terms of speed for |Xam text entry and ease of use.

    Geoinformatics in Citizen Science

    The book features contributions that report original research in the theoretical, technological, and social aspects of geoinformation methods, as applied to supporting citizen science. Specifically, the book focuses on the technological aspects of the field and their application toward the recruitment of volunteers and the collection, management, and analysis of geotagged information to support volunteer involvement in scientific projects. Internationally renowned research groups share research in three areas: First, the key methods of geoinformatics within citizen science initiatives to support scientists in discovering new knowledge in specific application domains or in performing relevant activities, such as reliable geodata filtering, management, analysis, synthesis, sharing, and visualization; second, the critical aspects of citizen science initiatives that call for emerging or novel approaches of geoinformatics to acquire and handle geoinformation; and third, novel geoinformatics research that could serve in support of citizen science

    Citizen Science: Reducing Risk and Building Resilience to Natural Hazards

    Natural hazards are becoming increasingly frequent within the context of climate change—making reducing risk and building resilience against these hazards more crucial than ever. An emerging shift has been noted from broad-scale, top-down risk and resilience assessments toward more participatory, community-based, bottom-up approaches. Arguably, non-scientist local stakeholders have always played an important role in risk knowledge management and resilience building. Rapidly developing information and communication technologies such as the Internet, smartphones, and social media have already demonstrated their sizeable potential to make knowledge creation more multidirectional, decentralized, diverse, and inclusive (Paul et al., 2018). Combined with technologies for robust and low-cost sensor networks, various citizen science approaches have emerged recently (e.g., Haklay, 2012; Paul et al., 2018) as a promising direction in the provision of extensive, real-time information for risk management (as well as improving data provision in data-scarce regions). It can serve as a means of educating and empowering communities and stakeholders that are bypassed by more traditional knowledge generation processes. This Research Topic compiles 13 contributions that interrogate the manifold ways in which citizen science has been interpreted to reduce risk against hazards that are (i) water-related (i.e., floods, hurricanes, drought, landslides); (ii) deep-earth-related (i.e., earthquakes and volcanoes); and (iii) responding to global environmental change such as sea-level rise. We have sought to analyse the particular failures and successes of natural hazards-related citizen science projects: the objective is to obtain a clearer understanding of “best practice” in a citizen science context

    Subtitling for a global audience. Handling the translation of culture-specific items in TEDx talks

    [EN] TED.com is a platform for sharing ideas through influential talks in video format on topics that range from science and technology to business. It engages volunteers from all over the world to help transcribe, subtitle and translate the talks' scripts in more than 100 languages. The justification for engaging volunteer transcribers is that transcribed talks can reach a wider audience because they are accessible to hearing-impaired individuals, can be indexed in search engines and can achieve TED's mission of spreading ideas by making transcripts available for translation through TED's Open Translation Project. Therefore, talk transcribers play a crucial role in the overall translation workflow and dissemination process, as they are responsible for transcribing the contents and foundations of what will later be translated into different languages. The objective of this paper is to analyse a corpus of talks originally delivered in different variants of Spanish to identify the most common strategies used by volunteer transcribers to handle local or idiomatic expressions and culturally biased items to reach the maximum audience possible and facilitate translation. Candel-Mora, MÁ.; González Pastor, DM. (2017). Subtitling for a global audience. Handling the translation of culture-specific items in TEDx talks. FORUM. Revue internationale d'interprétation et de traduction. International Journal of Interpretation and Translation. 15(2):288-304. doi:10.1075/forum.15.2.07can