35 research outputs found

    Design of an Offline Handwriting Recognition System Tested on the Bangla and Korean Scripts

    This dissertation presents a flexible and robust offline handwriting recognition system, tested on the Bangla and Korean scripts. Offline handwriting recognition is one of the most challenging unsolved problems in machine learning. While a few popular scripts (such as Latin) have received a lot of attention, many other widely used scripts (such as Bangla) have seen very little progress. Features such as connectedness and vowels structured as diacritics make Bangla a challenging script to recognize. A simple and robust design for offline recognition is presented which not only works reliably but can also be used for almost any alphabetic writing system. The framework has been rigorously tested on Bangla, and experiments on the Korean script, whose two-dimensional arrangement of characters makes it a challenge to recognize, demonstrate how the framework can be adapted to other scripts. The base of this design is a character spotting network which detects the location of different script elements (such as characters and diacritics) in an unsegmented word image. A transcript is formed from the detected classes based on their corresponding location information. This is the first reported lexicon-free offline recognition system for Bangla, and it achieves a Character Recognition Accuracy (CRA) of 94.8%. This is also one of the most flexible architectures ever presented. Recognition of Korean was achieved with a 91.2% CRA. In addition, a powerful autonomous tagging technique was developed which can drastically reduce the effort of preparing a dataset for any script. The combination of the character spotting method and autonomous tagging brings the entire offline recognition problem very close to a singular solution. Additionally, a database named the Boise State Bangla Handwriting Dataset was developed. 
This is one of the richest offline datasets currently available for Bangla, and it has been made publicly accessible to accelerate research progress. Many other tools were developed, and experiments were conducted to validate this framework more rigorously by evaluating the method against external datasets (CMATERdb 1.1.1, Indic Word Dataset, and REID2019: Early Indian Printed Documents). Offline handwriting recognition is an extremely promising technology, and the outcome of this research moves the field significantly ahead.
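
    The transcript-formation step mentioned in the abstract above (detected classes ordered by their location) might look roughly like the sketch below. The `Detection` record, the left-to-right ordering by horizontal centre, and the confidence threshold are all assumptions made for illustration; the abstract does not specify how the dissertation's system orders or filters detections.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str       # hypothetical class name of a spotted script element
    x_center: float  # assumed horizontal position in the word image (pixels)
    score: float     # assumed detector confidence in [0, 1]


def form_transcript(detections, min_score=0.5):
    """Keep confident detections, order them left to right, join the labels."""
    kept = [d for d in detections if d.score >= min_score]
    kept.sort(key=lambda d: d.x_center)
    return "".join(d.label for d in kept)


if __name__ == "__main__":
    spotted = [
        Detection("b", 30.0, 0.9),
        Detection("a", 10.0, 0.8),
        Detection("x", 20.0, 0.2),  # below the threshold, so it is dropped
    ]
    print(form_transcript(spotted))
```

    A simple one-dimensional sort like this would not be enough for a script with a two-dimensional character arrangement such as Korean; it is meant only to make the idea of location-based transcript assembly concrete.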

    On Improving Generalization of CNN-Based Image Classification with Delineation Maps Using the CORF Push-Pull Inhibition Operator

    Deployed image classification pipelines typically depend on images captured in real-world environments. This means that images might be affected by different sources of perturbation (e.g. sensor noise in low-light environments). The main challenge arises from the fact that image quality directly impacts the reliability and consistency of classification tasks, and it has therefore attracted wide interest within the computer vision community. We propose a transformation step that attempts to enhance the generalization ability of CNN models in the presence of unseen noise in the test set. Concretely, the delineation maps of given images are determined using the CORF push-pull inhibition operator. This operation transforms an input image into a space that is more robust to noise before it is processed by a CNN. We evaluated our approach on the Fashion MNIST data set with an AlexNet model. The proposed CORF-augmented pipeline achieved results on noise-free images comparable to those of a conventional AlexNet classification model without CORF delineation maps, but it consistently achieved significantly superior performance on test images perturbed with different levels of Gaussian and uniform noise.
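
    The kind of test-time perturbation described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: pixel intensities are taken to lie in [0, 1], images are flattened to plain lists, and the function names and noise levels are hypothetical rather than taken from the paper. The CORF operator itself is not implemented here.

```python
import random


def _clip(value):
    """Clamp a pixel intensity to the assumed [0, 1] range."""
    return min(1.0, max(0.0, value))


def add_gaussian_noise(pixels, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return [_clip(p + rng.gauss(0.0, sigma)) for p in pixels]


def add_uniform_noise(pixels, half_width, rng):
    """Add noise drawn uniformly from [-half_width, half_width]."""
    return [_clip(p + rng.uniform(-half_width, half_width)) for p in pixels]


if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed for repeatable perturbations
    image = [0.0, 0.25, 0.5, 1.0]   # a toy flattened "image"
    for level in (0.05, 0.1, 0.2):  # hypothetical noise levels
        print(level, add_gaussian_noise(image, level, rng))
        print(level, add_uniform_noise(image, level, rng))
```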

    Provincialising Bollywood: Bhojpuri cinema and the vernacularisation of North Indian media

    This thesis is an investigation of the explosive growth of Bhojpuri cinema alongside the vernacularisation of north Indian media in the last decade. As these developments take place under the shadow of Bollywood, the thesis also studies the aesthetic, political, and infrastructural nature of the relationship between vernacular media industries – Bhojpuri in particular – and Bollywood. The thesis then argues that Bhojpuri cinema, even as it provincialises Bollywood, aspires to sit beside it instead of displacing it. The outrightly confrontational readings notwithstanding, the thesis grapples with the ways in which the vernacular departs from its corresponding cosmopolitan form and how it negotiates cultural representation as an industry. The two chapters in Part I provide a narrative account of the discourses and media-texts that saturate the Bhojpuri public sphere. The prevailing discourses and the dominant texts, the thesis argues, resonate with each other, but also delimit the destiny of Bhojpuri film and media. The tug of war between the cultural and economic valuations of the Bhojpuri commodity, as between enchantment and discontent with its representative prowess, as also between ‘traditional’ values and reformist ‘modernity’, leaves us within an uncomfortable zone. The thesis shows how aspirations to male stardom consolidate this territory and become the logic by which the industry output keeps growing, in spite of a failing media economy. Each of the three chapters in Part II traces the historical trajectory of language, gendered use of public space, and piracy, respectively. In this part, the thesis establishes the analytical provenance for the emergence of Bhojpuri cinema in particular, and vernacular media in general. While Bhojpuri media allows Bhojpuri to seek its autonomy from state-supported Hindi, it also occupied the fringe economy of rundown theatres as Bollywood sought to move towards the multiplexes. 
If the advent of audiocassettes led to the emergence of Bhojpuri media sanskar, the availability of the single-screen economy after the arrival of multiplexes cleared the space for the theatrical exhibition of Bhojpuri cinema. The suboptimal transactions of counterfeit media commodities, on the other hand, regulate the legal counterpart and widen the net of distribution beyond the film theatre. I argue that the suboptimal practices are embedded within the unstable meanwhile. As an occupant of this meanwhile temporality, Bhojpuri film and media, whether in rundown theatres or on cheap mobile phones, grow via contingent and strategic coalitions. This thesis, then, argues that cinema as a form makes it possible for Bhojpuri speaking society to confront, and reconcile with, its own corporeality – the aural and visual footprints, the discursive and ideological blind spots, and the aspiration to break free. On account of the media economy and its power to ratify a new order of hierarchy via celebrity, Bhojpuri media threatens to transform the social order, yet remains open to the possibility of manipulation by which the old order could rechristen itself as new.

    Tune your brown clustering, please

    Brown clustering, an unsupervised hierarchical clustering technique based on n-gram mutual information, has proven useful in many NLP applications. However, most uses of Brown clustering employ the same default configuration; the appropriateness of this configuration has gone predominantly unexplored. Accordingly, we present information for practitioners on the behaviour of Brown clustering in order to assist hyper-parameter tuning, in the form of a theoretical model of Brown clustering utility. This model is then evaluated empirically in two sequence labelling tasks over two text types. We explore the dynamic between the input corpus size, chosen number of classes, and quality of the resulting clusters, which has an impact on any approach using Brown clustering. In every scenario that we examine, our results reveal that the values most commonly used for the clustering are sub-optimal.
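
    To make the "mutual information" in the abstract above concrete, the sketch below computes the average mutual information between the classes of adjacent tokens, which is the quantity Brown clustering tries to keep high as it merges classes. It is a toy calculation under assumptions: the function name, the corpus, and the word-to-class assignments are hypothetical, and no greedy merging or hierarchy construction is performed.

```python
import math
from collections import Counter


def class_bigram_mutual_information(tokens, word_to_class):
    """Average mutual information (in bits) between adjacent token classes."""
    classes = [word_to_class[t] for t in tokens]
    pairs = list(zip(classes, classes[1:]))
    n = len(pairs)
    pair_counts = Counter(pairs)
    left_counts = Counter(a for a, _ in pairs)
    right_counts = Counter(b for _, b in pairs)

    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = left_counts[a] / n
        p_b = right_counts[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


if __name__ == "__main__":
    corpus = "the cat sat the dog sat the cat ran".split()
    # Hypothetical assignment of words to three classes.
    assignment = {"the": 0, "cat": 1, "dog": 1, "sat": 2, "ran": 2}
    print(class_bigram_mutual_information(corpus, assignment))
```

    Comparing this score across different numbers of classes on the same corpus gives a rough feel for the trade-off between class count and cluster quality that the abstract discusses.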

    Fringe Map Based Text Line Segmentation of Printed Telugu Document Images

    No full text

    Colonial mutations of caste in Tamil Nadu : an essay on space and untouchability, with special reference to Madurai district, c.1500-1990

    No full text
    After the lengthy war of conquest, the British installed the Permanent Settlement on much of the dry zone of south India. This was part of an original pacification plan designed to be temporary; however, colonial interests later decided that it was politically convenient to maintain some of the "native rank" in the country. These zamindari estates became precisely the area where caste-inducing pseudo-jajmani systems enjoyed a colonized efflorescence. These changes occurred in the nineteenth century; not all of the peculiar traditions of the south Indian social world pre-date the colonial kali yuga.

    Development, Value, and Education in India's Digital Age

    This ethnography is an attempt to show the particular relationships between globalization, development, digitality, and urban-rural change as they are re-articulated in the actions and interactions between several groups – NGO personnel, teachers, students – living, working, and studying within educational spaces in South Karnataka, in regions in and around Bangalore city. My intervention, to put it simply, is to show how the condition of development in India, and specifically education-as-development, has changed in the contemporary global digital moment, and I identify the new concerns of each of these groups – how they sought to develop themselves and Others – in the wake of technologically-enabled globality and social reform-oriented connection. My own set of ethnographic stories begins at the heart of these education-as-development concerns, but relies on the specificity of my interactions with a single NGO, Adhyaapaka, based in Bangalore, but that worked with school communities outside of it. I have placed these NGO narratives in relation to another set of narratives from one school site in which Adhyaapaka works, Adavisandra school. What I discovered, inadvertently, was an alternative shape that global development takes when seen through the stories of teachers and students, equally tied to the idea of a changing India, but inflected with aspirations and commitments that reflected the unique lived experiences of those who were participating in schooling in the village. This is also to say that, at least in India, any global-digital future is always a “global-urban-rural future” and throughout this study I mark instances of urban-rural linkage and boundary, always as a means to understand how individuals perceive development-based change. To this end, I further the concept of value migrations, a set of mediated imaginings and aspirations that reflect the circulation of values and the concomitant changes wrought in villages. 
In unpacking the concept of “value” I foreground the inextricable link between global economic structures, human development, and village change. Further, I connect value to affect, showing how structures of economic power work on a psychosocial register, manifesting as dreams, hopes, desires, nostalgias, anxieties, and sufferings, which together are what I term the “affects of development”.

    Cities in South Asia

    Globalisation has long historical roots in South Asia, but economic liberalisation has led to uniquely rapid urban growth in South Asia during the past decade. This book brings together a multidisciplinary collection of chapters on contemporary and historical themes explaining this recent explosive growth and the ongoing transformations in the cities of this region. The essays in this volume attempt to shed light on the historical roots of these cities and the traditions that are increasingly placed under strain by modernity, as well as exploring the lived experience of a new generation of city dwellers and their indelible impact on those who live at the city’s margins. The book discusses how cities such as Mumbai previously grew by accumulating a vast hinterland of slum-dwellers who depressed wages and supplied cheap labour to the city’s industrial economy. However, it goes on to show that the new growth of cities such as Bangalore, Hyderabad, and Madras in south India, or Delhi and Calcutta in the north of India, is more capital-intensive, export-driven, and oriented towards the information technology and service sectors. The book explains that these cities have attracted a new elite of young, educated workers, with money to spend and an outlook on life that is often a complex mix of modern ideas and conservative tradition. It goes on to cover topics such as the politics of town planning, consumer culture, and the struggles among multiple identities in the city. By tracing the genealogies of cities, it gives a useful insight into the historical conditioning that determines how cities negotiate new changes and influences. There will soon be more megacities in South Asia than anywhere else in the world, and this book provides an in-depth analysis of this growth. It will be of interest to students and scholars of South Asian History, Politics and Anthropology, as well as those working in the fields of urbanisation and globalisation.

    Digital Research Cycles: How Attitudes Toward Content, Culture And Technology Affect Web Development.

    It has been estimated that one third of the world's population does not have access to adequate health care. Some 1.6 billion people live in countries experiencing concentrated acquired immune deficiency syndrome (AIDS) epidemics. Many countries in Africa--and other low-income countries--are in dire need of help providing adequate health care services to their citizens. They require more hands-on care from Western health workers--and training so more African health workers can eventually care for their own citizens. But these countries also need assistance acquiring and implementing both texts--the body of medical information potentially available to them--and technology--the means by which that information can be conveyed. This dissertation looks at these issues and others from a multi-faceted approach. It combines a survey of the developers of Web sites designed for use by health workers in low-income countries and a proposal for a novel approach to communication theory, which could help improve health communication and other social marketing practices. It also includes an extensive review of literature regarding a number of topics related to these issues. To improve healthcare services in low-income countries, several things should occur. First, more health workers--and others--could visit African countries and other places to provide free, hands-on medical care, as this researcher's group did in Uganda. Such trips are ideal occasions for studying the cultural differences between mzungu (white man) and the Ugandan people. A number of useful medical texts have been written for health workers in low-income countries. Others will be published as new health information becomes available. But on what medium will they be published? Computers? Personal digital assistants? During the past 10 years the Internet became an ideal venue for conveying information. 
Unfortunately, people in target countries such as Uganda encounter cultural differences when such new technologies are diffused. This dissertation looks at cultural and technological difficulties encountered by people in low-income countries who attempt to diffuse information and communication technologies (ICT). Once a technology has been successfully adopted, someone will look for ways to use it to help others. There are hundreds of sites on the Internet--built by Web developers in Western countries--that are designed for use by health workers in low-income countries. However, these Web developers also experience cultural and technological differences, based on their knowledge of and attitudes toward best practices in their field. This research includes a survey of Web developers which determined their attitudes toward best practices in their field and tested this researcher's hypothesis that there is no significant difference among the developers' attitudes toward the content on their sites, their audience's cultural needs and the various technological needs their audience has. It was found that the Web developers agree with 17 of 18 perceived best practices and that there is a significant difference between Web developers' attitudes toward their audience's technological needs and their attitudes toward quality content and the audience's cultural needs. Creation of the survey herein resulted in this researcher generating a new way of thinking about communication theory--called digital research cycles. The survey was based on a review of literature and is rooted in the belief that any successful communication of a computer-mediated message in the information age is a behavior which is influenced by the senders' and receivers' attitudes and knowledge about textual style, the audience, technology and the subject matter to which the message pertains.