3,009 research outputs found

    Zone Segmentation and Thinning based Algorithm for Segmentation of Devnagari Text

    Character segmentation of handwritten documents is a challenging research topic because of its diverse application environments. OCR can be used for automated processing and handling of forms, old corrupted reports, bank cheques, postal codes and structures. Segmenting a word into characters is one of the major challenges in optical character recognition. It is even more challenging in offline handwritten documents, where a further hurdle is the presence of broken, touching and overlapped characters in Devnagari script. In this paper we introduce an algorithm that segments both broken and touching characters in Devnagari script, using both zone segmentation and thinning-based techniques. We used 85 words each for isolated, broken, touching, and both broken and touching characters. Segmentation results for broken and touching characters are 96.2% on average.

    From words to books


    A Study of the Apadāna, Including an Edition and Annotated Translation of the Second, Third and Fourth Chapters

    The Apadāna is a Theravāda Buddhist text in the Pāli language which contains a large collection of “autohagiographies” in verse. It is under-researched, partly because the Pali Text Society edition of this text is not of a high standard and partly because very few of its poems have been translated into any European language. The aim of this thesis is to provide a better understanding of the Apadāna’s content, its relationship to similar texts and the nature of its historical transmission. A series of textual comparisons revealed that the Apadāna has structural, stylistic and thematic similarities to a range of other early Buddhist texts. In particular, the system of karma underlying much of its narrative is reasonably consistent with that of several early Sanskrit avadāna collections, including its basic technical vocabulary. A major component of this thesis is an edition and annotated translation of the second, third and fourth chapters of the Apadāna. This new edition has been edited according to stemmatic principles, using a careful selection of nine palm leaf manuscripts (in Sinhala, Burmese and Khom scripts) and four printed editions (in Roman, Sinhala, Burmese and Thai scripts). The base text of this edition represents the reconstructed archetype of the selected manuscripts, corrected only where absolutely necessary. The corresponding annotated English translation has been produced with critical reference to the text’s primary commentary in Pāli, the Apadānaṭṭhakathā, and a word-by-word Burmese language nissaya translation. A major finding is that existing printed editions of the Apadāna not infrequently include silent emendations of the received text and also often reproduce the “smoother” and more easily understood readings first produced during the editorial preparations to the “fifth Buddhist council” of 1871 in Mandalay. 
    More generally, this thesis demonstrates the indispensability of manuscripts for the historical study of Pāli language and literature.

    Acoustic Modelling for Under-Resourced Languages

    Automatic speech recognition systems have so far been developed for only a few of the 4,000-7,000 existing languages. In this thesis we examine methods to rapidly create acoustic models in new, possibly under-resourced languages, in a time- and cost-effective manner. For this we examine the use of multilingual models, the application of articulatory features across languages, and the automatic discovery of word-like units in unwritten languages.

    Market Lingos and Metrolingua Francas

    © Taylor & Francis Group, LLC. Drawing on data recorded in two city markets, this article analyzes the language practices of workers and customers as they go about their daily business, with a particular focus on the ways in which linguistic resources, everyday tasks, and social spaces are intertwined in producing metrolingua francas. The aim of the article is to come to a better understanding of the relationships among the use of diverse linguistic resources (drawn from different languages, varieties, and registers), the repertoires of the workers, the activities in which they are engaged, and the larger space in which this occurs. Developing the idea of spatial repertoires as the linguistic resources available in particular places, we explore the ways in which metrolingua francas (metrolingual multilingua francas) emerge from the spatial resources of such markets.