HiPHET: A Hybrid Approach to Translate Code Mixed Language (Hinglish) to Pure Languages (Hindi and English)

Author: Attri Shree Harsh
Prasad T.V.
Ramakrishna G.
Publication venue: 'AGHU University of Science and Technology Press'
Publication date: 27/09/2020

Bilingual code-mixed (hybrid) languages have become very popular in India as a result of the spread of Western technology in the form of television, the Internet, and social media. Due to this increase in the usage of code-mixed languages in day-to-day communication, the need to maintain the integrity of Indian languages has arisen. To meet this need, a tool named Hinglish to Pure Hindi and English Translator was developed. The tool translates in three ways: Hinglish to Pure Hindi and Pure English, Pure Hindi to Pure English, and vice versa. With Hinglish input sentences, the tool achieved an accuracy of 91% for Hindi output and 84% for English output. The paper also compares the tool with another similar tool.

AGH (Akademia Górniczo-Hutnicza) University of Science and Technology: Journals

Computer Science Journal (AGH University of Science and Technology, Krakow)

Survey on Hinglish to English Translation and Classification Techniques

Author: D’Souza Nicole
Patel Devarsh
Rao Ashwini
Saravta Jigyashu
Publication venue: Auricle Global Society of Education and Research
Publication date: 01/09/2023

Code-mixing is the process of using multiple languages in one sentence and occurs widely in multilingual communities. It is particularly prevalent in texts on social media. Due to the widespread usage of social networking sites, a substantial amount of unstructured text is produced. Hinglish, i.e. code-mixed Hindi and English, is a frequent occurrence in everyday language use in India. Hence, a translation process is required to help monolingual users and to aid in the comprehension of language processing models. In this paper, we study the effective techniques for classification and translation tasks and also find gaps and challenges in the current research domain. After comparing a few existing methodologies for machine translation, a framework that showed an improvement in the translation task over the previous methods is proposed.

International Journal on Recent and Innovation Trends in Computing and Communication

Identification of monolingual and code-switch information from English-Kannada code-switch data

Author: Chundi Ramesh
Hulipalled Vishwanath R.
Simha Jay Bharthish
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/10/2023

Code-switching is a very common occurrence in social media communication, predominantly found in multilingual countries like India. Using more than one language in communication is known as code-switching or code-mixing. Important applications of code-switching include machine translation (MT), shallow parsing, dialog systems, and semantic parsing. Identifying code-switch and monolingual information is useful for better communication on online networking websites. In this paper, we applied a character-level n-gram approach to identify monolingual and code-switch information from English-Kannada social media data. We compared various machine learning techniques such as naïve Bayes (NB), support vector classifier (SVC), logistic regression (LR) and neural network (NN) on English-Kannada code-switch (EKCS) data. With the proposed approach, the character-level n-gram features provide an improvement of 1.8% to 4.1% in accuracy and 1.6% to 3.8% in F1-score. We also observed that the SVC and NN techniques performed best in terms of accuracy (97.9%) and F1-score (98%) with character-level n-grams.
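
The character-level n-gram idea in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the n-gram range (1 to 3), the padding scheme, the Laplace-smoothed naïve Bayes classifier, the names `char_ngrams` and `NgramNaiveBayes`, and the toy sentences are all assumptions chosen only to show how character n-grams can serve as features for separating monolingual from code-switched text.

```python
import math
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Return all character n-grams of length n_min..n_max.

    The text is lowercased and padded with one space on each side so that
    n-grams at the start and end of the text are captured (an assumption).
    """
    padded = f" {text.lower()} "
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


class NgramNaiveBayes:
    """Multinomial naive Bayes over character n-grams, with add-one smoothing."""

    def fit(self, texts, labels):
        self.doc_counts = Counter(labels)   # number of training texts per label
        self.gram_counts = {}               # label -> Counter of n-grams
        self.vocab = set()                  # all n-grams seen in training
        for text, label in zip(texts, labels):
            grams = char_ngrams(text)
            self.gram_counts.setdefault(label, Counter()).update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        grams = char_ngrams(text)
        best_label, best_score = None, -math.inf
        for label, counter in self.gram_counts.items():
            total = sum(counter.values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            for gram in grams:
                score += math.log((counter[gram] + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    # Made-up toy sentences (romanized), purely illustrative; not EKCS data.
    texts = [
        "i am going to the office today",
        "see you at the meeting tomorrow",
        "nanu office ge hogtini today",
        "meeting nale ide so bega banni",
    ]
    labels = ["monolingual", "monolingual", "code-switch", "code-switch"]
    model = NgramNaiveBayes().fit(texts, labels)
    print(model.predict("nanu meeting ge hogtini"))
```

Character n-grams are a natural fit here because romanized text from two languages tends to differ in its sub-word letter patterns, so the features do not require a word-level dictionary for either language. Any of the other classifiers named in the abstract could be substituted for the naïve Bayes step.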

Institute of Advanced Engineering and Science