87 research outputs found

    A Comparison of Machine Learning Techniques for Handwritten |Xam Word Recognition

    The Bleek and Lloyd collection contains 19th-century handwritten notebooks that document the language and culture of the |Xam-speaking people who lived in Southern Africa. Access to this rich data could be enhanced by transcriptions of the text; however, the complex diacritics used in the notebooks complicate the process of transcription. Machine learning techniques could be used to perform this transcription, but it is not known which techniques would produce the best results. This paper thus reports on a comparison of three popular techniques applied to this problem: artificial neural networks (ANN), hidden Markov models (HMM), and support vector machines (SVM). It was found that an SVM-based classifier using histograms of oriented gradients as features resulted in the best word recognition accuracy of 58.4%. Furthermore, it was found that most feature extraction parameters did not have a large effect on recognition accuracy and that the SVM-based recognisers outperformed both ANN- and HMM-based recognisers.

    Transcription of the Bleek and Lloyd Collection using the Bossa Volunteer Thinking Framework

    The digital Bleek and Lloyd Collection is a rare collection that contains artwork, notebooks and dictionaries of the earliest inhabitants of Southern Africa. Previous attempts have been made to recognize the complex text in the notebooks using machine learning techniques, but due to the complexity of the manuscripts the recognition accuracy was low. In this research, a crowdsourcing-based method is proposed to transcribe the historical handwritten manuscripts, where volunteers transcribe the notebooks online. An online crowdsourcing transcription tool was developed and deployed. Experiments were conducted to determine the quality of transcriptions and accuracy of the volunteers compared with a gold standard. The results show that volunteers are able to produce reliable transcriptions of high quality. The inter-transcriber agreement is 80% for |Xam text and 95% for English text. When the |Xam text transcriptions produced by the volunteers are compared with the gold standard, the volunteers achieve an average accuracy of 69.69%. Findings show that there exists a positive linear correlation between the inter-transcriber agreement and the accuracy of transcriptions. The user survey revealed that volunteers found the transcription process enjoyable, though it was difficult. Results indicate that volunteer thinking can be used to crowdsource intellectually intensive tasks in digital libraries, such as the transcription of handwritten manuscripts. Volunteer thinking outperforms machine learning techniques at the task of transcribing notebooks from the Bleek and Lloyd Collection.

    Quality Assessment in Crowdsourced Indigenous Language Transcription

    The digital Bleek and Lloyd Collection is a rare collection that contains artwork, notebooks and dictionaries of the indigenous people of Southern Africa. The notebooks, in particular, contain stories that encode the language, culture and beliefs of these people, handwritten in now-extinct languages with a specialised notation system. Previous attempts have been made to convert the approximately 20,000 pages of text to a machine-readable form using machine learning algorithms but, due to the complexity of the text, the recognition accuracy was low. In this paper, a crowdsourcing method is proposed to transcribe the manuscripts, where non-expert volunteers transcribe pages of the notebooks using an online tool. Experiments were conducted to determine the quality and consistency of transcriptions. The results show that volunteers are able to produce reliable transcriptions of high quality. The inter-transcriber agreement is 80% for |Xam text and 95% for English text. When the |Xam text transcriptions produced by the volunteers are compared with a gold standard, the volunteers achieve an average accuracy of 64.75%, which exceeded that in previous work. Finally, the degree of transcription agreement correlates with the degree of transcription accuracy. This suggests that the quality of unseen data can be assessed based on the degree of agreement among transcribers.
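The idea of using agreement among transcribers as a proxy for accuracy can be illustrated with a minimal sketch. The abstract does not state how agreement or accuracy was measured, so the similarity measure used here (the ratio from Python's `difflib.SequenceMatcher`) and the function names `similarity`, `agreement`, and `accuracy` are assumptions for illustration only. The sample strings are made up and do not come from the collection.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] (an assumed measure, not the paper's)."""
    return SequenceMatcher(None, a, b).ratio()


def agreement(transcriptions: list[str]) -> float:
    """Mean pairwise similarity among several transcriptions of the same line."""
    pairs = list(combinations(transcriptions, 2))
    if not pairs:
        raise ValueError("need at least two transcriptions")
    return sum(similarity(a, b) for a, b in pairs) / len(pairs)


def accuracy(transcriptions: list[str], gold: str) -> float:
    """Mean similarity of each transcription to a gold-standard string."""
    return sum(similarity(t, gold) for t in transcriptions) / len(transcriptions)


# Made-up example lines; not real data from the notebooks.
gold = "the lion went away"
consistent = ["the lion went away", "the lion went away", "the lion went awai"]
divergent = ["the lion went away", "tho lian wont", "a lion goes"]

for name, group in (("consistent", consistent), ("divergent", divergent)):
    print(name, round(agreement(group), 2), round(accuracy(group, gold), 2))
```

In this toy setting the group with higher agreement also scores higher against the gold standard, which is the kind of relationship that would allow agreement to stand in for accuracy when no gold standard exists.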

    A System for High Quality Crowdsourced Indigenous Language Transcription

    In this article, a crowdsourcing method is proposed to transcribe manuscripts from the Bleek and Lloyd Collection, where non-expert volunteers transcribe pages of the handwritten text using an online tool. The digital Bleek and Lloyd Collection is a rare collection that contains artwork, notebooks and dictionaries of the indigenous people of Southern Africa. The notebooks, in particular, contain stories that encode the language, culture and beliefs of these people, handwritten in now-extinct languages with a specialised notation system. Previous attempts have been made to convert the approximately 20,000 pages of text to a machine-readable form using machine learning algorithms but, due to the complexity of the text, the recognition accuracy was low. This article presents details of the system used to enable transcription by volunteers as well as results from experiments that were conducted to determine the quality and consistency of transcriptions. The results show that volunteers are able to produce reliable transcriptions of high quality. The inter-transcriber agreement is 80% for |Xam text and 95% for English text. When the |Xam text transcriptions produced by the volunteers are compared with a gold standard, the volunteers achieve an average accuracy of 64.75%, which exceeded that in previous work. Finally, the degree of transcription agreement correlates with the degree of transcription accuracy. This suggests that the quality of unseen data can be assessed based on the degree of agreement among transcribers.

    Learning to Read Bushman: Automatic Handwriting Recognition for Bushman Languages

    The Bleek and Lloyd Collection contains notebooks that document the tradition, language and culture of the Bushman people who lived in South Africa in the late 19th century. Transcriptions of these notebooks would allow for the provision of services such as text-based search and text-to-speech. However, these notebooks are currently only available in the form of digital scans and the manual creation of transcriptions is a costly and time-consuming process. Thus, automatic methods could serve as an alternative approach to creating transcriptions of the text in the notebooks. In order to evaluate the use of automatic methods, a corpus of Bushman texts and their associated transcriptions was created. The creation of this corpus involved: the development of a custom method for encoding the Bushman script, which contains complex diacritics; the creation of a tool for creating and transcribing the texts in the notebooks; and the running of a series of workshops in which the tool was used to create the corpus. The corpus was used to evaluate the use of various techniques for automatically transcribing the texts in the corpus in order to determine which approaches were best suited to the complex Bushman script. These techniques included the use of Support Vector Machines, Artificial Neural Networks and Hidden Markov Models as machine learning algorithms, which were coupled with different descriptive features. The effect of the texts used for training the machine learning algorithms was also investigated as well as the use of a statistical language model. It was found that, for Bushman word recognition, the use of a Support Vector Machine with Histograms of Oriented Gradient features resulted in the best performance and, for Bushman text line recognition, Marti & Bunke features resulted in the best performance when used with Hidden Markov Models. 
    The automatic transcription of the Bushman texts proved to be difficult and the performance of the different recognition systems was largely affected by the complexities of the Bushman script. It was also found that, besides having an influence on determining which techniques may be the most appropriate for automatic handwriting recognition, the texts used in an automatic handwriting recognition system also play a large role in determining whether or not automatic recognition should be attempted at all.

    Automatic Analysis of Archimedes’ Spiral for Characterization of Genetic Essential Tremor Based on Shannon’s Entropy and Fractal Dimension

    Among neural disorders related to movement, essential tremor has the highest prevalence; in fact, it is twenty times more common than Parkinson's disease. The drawing of the Archimedes' spiral is the gold standard test to distinguish between the two pathologies. The aim of this paper is to select non-linear biomarkers based on the analysis of digital drawings. It belongs to a larger cross study for early diagnosis of essential tremor that also includes genetic information. The proposed automatic analysis system consists of a hybrid solution: Machine Learning paradigms and automatic selection of features based on statistical tests using medical criteria. Moreover, the selected biomarkers comprise not only commonly used linear features (static and dynamic), but also other non-linear ones: Shannon entropy and Fractal Dimension. The results are promising, and the developed tool can easily be adapted to users; taking into account social and economic points of view, it could be very helpful in real complex environments. This research was partially funded by the Basque Government, the University of the Basque Country by the IT1115-16 project-ELEKIN, Diputacion Foral de Gipuzkoa, University of Vic-Central University of Catalonia under the research grant R0947, and the Spanish Ministry of Science and Innovation TEC2016-77791-C04-R
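As a rough illustration of one of the non-linear features named above, the following minimal sketch estimates the Shannon entropy of a one-dimensional signal from a histogram. The abstract does not say which drawing signal is analysed or which estimator is used, so the histogram approach, the function name `shannon_entropy`, and the `bins` parameter are assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(samples, bins=8):
    """Histogram-based Shannon entropy (in bits) of a 1-D sequence of numbers.

    Assumed estimator for illustration; the paper's exact method is not given here.
    """
    lo, hi = min(samples), max(samples)
    if hi == lo:
        return 0.0  # a constant signal carries no uncertainty
    width = (hi - lo) / bins
    # Assign each sample to a bin; the maximum value falls into the last bin.
    counts = Counter(min(int((x - lo) / width), bins - 1) for x in samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# A constant signal versus one spread evenly across all eight bins.
print(shannon_entropy([1.0] * 10))
print(shannon_entropy(list(range(8)), bins=8))
```

A signal whose values concentrate in a few bins yields low entropy, while one spread evenly across the bins approaches the maximum of log2(bins) bits.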

    Bayesian Data Augmentation and Generative Active Learning for Robust Imbalanced Deep Learning

    Deep learning has become a leading machine learning approach in many domains such as image classification, face recognition, and autonomous driving cars. However, its success is predicated on the availability of immense labelled training sets. Furthermore, it is usually the case that these data sets need to be well-balanced, otherwise the performance of the trained model is compromised. The outstanding performance of deep learning compared to other traditional machine learning approaches is therefore traded off by the need for a significant amount of human resources for labelling and computational resources for training. Designing effective deep learning approaches that can perform well using small and imbalanced labelled training sets is essential since that will increase the use of deep learning in many real-life applications. In this thesis, we investigate several learning approaches that aim to improve the data efficiency in training deep models. In particular, we propose novel effective learning methods that enable deep learning models to perform well with relatively small and imbalanced labelled training sets. We first introduce a novel theoretically sound Bayesian data augmentation (BDA) method motivated by the fact that the current dominant data augmentation (DA), based on small geometric and appearance transformations of the original training samples, does not guarantee the usefulness and the realism of the generated samples. We formulate BDA with the generalised Monte-Carlo expectation maximisation (GMCEM). We theoretically show the weak convergence of GMCEM and introduce an implementation of BDA based on a variant of the generative adversarial network (GAN). We empirically demonstrate that our proposed BDA performs better than the dominant DA above. One of the drawbacks of BDA mentioned above is that the generation of synthetic training samples is performed without considering their informativeness to the training process.
    Therefore, we next propose a new Bayesian generative active deep learning (BGADL) approach that aims to train a generative model to produce novel informative training samples. We formulate this algorithm based on a theoretically sound combination of the Bayesian active learning by disagreement (BALD) and BDA, where BALD guides BDA to produce synthetic samples. We provide a formal proof that these generated samples are informative for the training process. We provide empirical evidence that our proposed BGADL outperforms BDA and BALD with respect to training efficiency and classification accuracy. The Bayesian generative active deep learning above does not properly handle class-imbalanced training that may occur in the updated training sets formed at each iteration of the algorithm. We extend BGADL with an approach that is robust to imbalanced training data by combining it with a sample re-weighting learning approach. We empirically demonstrate that the extended BGADL performs well on several imbalanced data sets and produces better classification results compared to other baselines. In summary, the contributions of this thesis are the introduction of the following novel methods: Bayesian data augmentation, Bayesian generative active deep learning, and a robust Bayesian generative active deep learning for imbalanced learning. All of those contributions are supported by theoretical justifications, empirical evidence and published or submitted papers. Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 201

    Using contour information and segmentation for object registration, modeling and retrieval

    This thesis considers different aspects of the utilization of contour information and syntactic and semantic image segmentation for object registration, modeling and retrieval in the context of content-based indexing and retrieval in large collections of images. Target applications include retrieval in collections of closed silhouettes, holistic word recognition in handwritten historical manuscripts and shape registration. Also, the thesis explores the feasibility of contour-based syntactic features for improving the correspondence of the output of bottom-up segmentation to semantic objects present in the scene and discusses the feasibility of different strategies for image analysis utilizing contour information, e.g. segmentation driven by visual features versus segmentation driven by shape models or semi-automatic in selected application scenarios. There are three contributions in this thesis. The first contribution considers structure analysis based on the shape and spatial configuration of image regions (so-called syntactic visual features) and their utilization for automatic image segmentation. The second contribution is the study of novel shape features, matching algorithms and similarity measures. Various applications of the proposed solutions are presented throughout the thesis, providing the basis for the third contribution, which is a discussion of the feasibility of different recognition strategies utilizing contour information. In each case, the performance and generality of the proposed approach have been analyzed based on extensive rigorous experimentation using test collections that are as large as possible.

    The Right to Pain and the Limits of Testimony

    “The Right to Pain and the Limits of Testimony” centers on two questions: Who has the right to pain? Who is permitted to speak about issues of injustice affecting them? I contend that the indifference, disavowal, or appropriation of pain results from a structure of witnessing that accepts violence and injuries on some subjects as deserving, natural, or unreal. Pain, illness, and disability are naturalized within marginalized communities because the terms of death are gendered, racialized, classed, and ableist. By analyzing Vietnamese American memoirs, novels, photography, and the War Remnants Museum in Ho Chi Minh City as testimonies, I make connections among Critical Refugee Studies, Disability Studies, and Visual Culture Studies to reveal the military, economic, and racial systems that expose immigrants and refugees to overlooked danger. These textual, visual, and physical sites present the debilitations produced by the Vietnam-US War by adopting strategic frames of reference, narrative construction, and language in order to connect with the observers, revealing that witnessing maintains an asymmetrical power structure. Models of witnessing give too much interpretive power to the witness, allowing a privileged group to define what counts as pain, the identity of victims, the language of testimony, and appropriate reparation and healing methods. In the closed system of witnessing, the savior is often also the perpetrator. By looking at moments in which witnessing fails the testifier, I deromanticize witnessing, shift interpretative power, and assemble an alternative archive on the Vietnam-US War, pain, and healing. My dissertation presumes that witnessing fails while clinging to the potential of testimony. Each chapter examines the body as testimony (the visible Agent Orange impairment and invisible illnesses of transhistorical pain and synesthesia) as another way to know the war.
    The disability as an index of debilitation is a bridge that links history to the present, event to language, self to an audience, imbuing the testimony with urgency and ethical dimensions. My attention to the limitations of witnessing raises concerns and strategies for accounting for silent voices. My project promotes the value of the victim’s language, frame of reference, unique vision, and particular demands in order to resist the listener’s power as ultimate savior in the exchange. I view testifiers as neither innocent victims nor unfeeling objects but as complex agents, intensely negotiating motives, languages, frameworks and multiple audiences. The focus on pain accounts for the complexities and the ongoingness of debility of Vietnamese affected by wars, colonialism, poverty, and dislocation. Recognizing the limits of visibility to engender compassion and necessary changes, my work also attends to personal and communal healing strategies. The attention to marginalized forms of care emphasizes agency and rejects the white savior complex that underlies leading scholarship on ethics by Emmanuel Levinas, Paul Ricoeur, and Judith Butler. I attend to the creativity, endurance, and commemoration associated with pain. The survival strategies evident in Vietnamese American art and literature have aesthetic and epistemological value that expands understanding of trauma and reshapes engagements with discourses of race, war, globalization, and community formation. PhD, English Language & Literature, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/163097/1/aibinhho_1.pd