4,223 research outputs found

    A New Computational Schema for Euphonic Conjunctions in Sanskrit Processing

    Automated language processing is central to the drive to facilitate referencing of increasingly available Sanskrit e-texts. The first step towards processing Sanskrit text involves handling the compound words that are an integral part of Sanskrit texts. This in turn requires processing euphonic conjunctions, or sandhi-s, which are points within or between words at which adjacent letters coalesce and transform. The ancient Sanskrit grammarian Pāṇini’s codification of Sanskrit grammar is the accepted authority on the subject. His famed sūtra-s, or aphorisms, numbering approximately four thousand, tersely, precisely and comprehensively codify the rules of the grammar, including all the rules pertaining to sandhi-s. This work presents a new approach to processing sandhi-s in terms of a computational schema. The computational model is based on Pāṇini’s complex codification of the rules of grammar. The model has simple beginnings yet is powerful, comprehensive and computationally lean.

    Input Scheme for Hindi Using Phonetic Mapping

    Written communication on computers requires knowing how to enter text in the desired language. Most people do not use any language other than English, which creates a barrier. To resolve this issue, we have developed a scheme for entering Hindi text using phonetic mapping. Using this scheme, we generate intermediate code strings and match them with the pronunciations of the input text. Our system shows significant success over other available input systems.

    Opportunities and Challenges of Handwritten Sanskrit Character Recognition System

    The rapid growth of internet facilities and digitalization has changed the way people live. With internet services, anyone can access data from anywhere. A large amount of online data is generated every day, and that data needs to be processed before information can be extracted from it. The demand for Natural Language Processing (NLP) techniques has therefore increased. Pattern recognition is a sub-field of NLP. It is a branch of machine learning that has contributed greatly to computer vision and machine vision applications, and it is concerned with recognizing patterns and regularities in data. Handwriting recognition is one of the challenging subtasks and current research areas within pattern recognition, owing to the variety of writing and handwriting styles. Handwritten Sanskrit character recognition, which works in online and offline modes, is more complicated than for other languages because Sanskrit characters have more consonants and modifiers. This paper discusses the opportunities and challenges of a handwritten Sanskrit character recognition system.

    Phonetic Dictionary for Natural Language Processing: Kannada

    India has 22 officially recognized languages: Assamese, Bengali, English, Gujarati, Hindi, Kannada, Kashmiri, Konkani, Malayalam, Manipuri, Marathi, Nepali, Oriya, Punjabi, Sanskrit, Tamil, Telugu, and Urdu. Clearly, India faces a language diversity problem. In the age of the Internet, the multiplicity of languages makes it even more necessary to have sophisticated systems for Natural Language Processing. In this paper we develop a phonetic dictionary for natural language processing, specifically for Kannada. Phonetics is the scientific study of speech sounds. Acoustic phonetics studies the physical properties of sounds and provides a language for distinguishing one sound from another in quality and quantity. Kannada is one of the major Dravidian languages of India. The language uses forty-nine phonemic letters, divided into three groups: Swaragalu (thirteen letters); Yogavaahakagalu (two letters); and Vyanjanagalu (thirty-four letters), similar to the vowels and consonants of English, respectively.

    Classification of Humans into Ayurvedic Prakruti Types using Computer Vision

    Ayurveda, a 5000-year-old Indian medical science, holds that the universe, and hence humans, are made up of five elements, namely ether, fire, water, earth, and air. The three Doshas (Tridosha), Vata, Pitta, and Kapha, originate from combinations of these elements. Every person has a unique combination of Tridosha elements contributing to that person’s ‘Prakruti’. Prakruti governs the physiological and psychological tendencies in all living beings as well as the way they interact with the environment. This balance influences their physiological features, such as the texture and colour of skin, hair, eyes, length of fingers, the shape of the palm, body frame, strength of digestion and many more, as well as psychological features such as their nature (introverted, extroverted, calm, excitable, intense, laidback) and their reaction to stress and diseases. All these features are coded in the constituents at the time of a person’s creation and do not change throughout their lifetime. Ayurvedic doctors analyze the Prakruti of a person by assessing the physical features manually and/or by examining the nature of their heartbeat (pulse). Based on this analysis, they diagnose, prevent and cure disease in patients by prescribing precision medicine. This project focuses on identifying the Prakruti of a person by analysing facial features such as hair, eyes, nose, lips and skin colour using facial recognition techniques in computer vision. This is the first research of its kind in this problem area that attempts to bring image processing into the domain of Ayurveda.

    MatriVasha: A Multipurpose Comprehensive Database for Bangla Handwritten Compound Characters

    Recognition of Bangla handwritten compound characters has been an essential issue for many years. In recent years, application-based research in machine learning and deep learning has gained interest, most notably handwriting recognition, because it has important applications such as Bangla OCR. MatriVasha is a project for recognizing several Bangla handwritten compound characters. Compound character recognition is currently an important topic because of its varied applications, and it helps to digitize old forms and information reliably. Unfortunately, there is a lack of a comprehensive dataset that can categorize all types of Bangla compound characters. MatriVasha is an attempt to align compound characters, which is challenging because each person has a unique style of writing shapes. MatriVasha proposes a dataset intended for recognizing 120 (one hundred twenty) Bangla compound characters, consisting of 2552 (two thousand five hundred fifty-two) isolated handwritten characters written by unique writers and collected from within Bangladesh. This dataset confronts problems in district-, age-, and gender-based handwriting research, because the collected samples cover a variety of districts and age groups and an equal number of males and females. To date, our proposed dataset is the most extensive dataset for Bangla compound characters. It is intended to frame the recognition technique for handwritten Bangla compound characters. In the future, this dataset will be made publicly available to help widen the research. Comment: 19 fig, 2 tabl

    Secularism's names: Commitment to confusion and the pedagogy of the name

    This essay takes up social and political questions of naming that are often ignored in studies of inequality or exclusion. What if South Asian personal names ceased to reveal demographic ‘data’ about their bearers, scrambling any attempt at automatic categorization? The focus here is on naming and/or renaming for ideological reasons, and in such ways that the identity of the bearer is deliberately blurred. Grounded in ethnographic work amongst committed proponents of secularism in India (principally rationalist, humanist, and atheist activists), the essay identifies two main strategies that activists use for the production of ‘disidentification’: purification of the caste and religious connotations of names, and multiplication of those connotations in the giving of boundary-crossing names. Common to each is a rationale that seeks to break the association between name and pigeonholed identity. However, acts of renaming, and non-normative names as such, can be and are contested. Thus, in order to clarify what is at stake in the domain of secular naming practices the essay also focuses on debates and criticisms from both within and outside it