    HiPHET: A Hybrid Approach to Translate Code Mixed Language (Hinglish) to Pure Languages (Hindi and English)

    Bilingual code-mixed (hybrid) languages have become very popular in India as a result of the spread of Western technology in the form of television, the Internet and social media. Due to this increase in the usage of code-mixed languages in day-to-day communication, the need to maintain the integrity of Indian languages has arisen. In response to this need, a tool named Hinglish to Pure Hindi and English Translator was developed. The tool translates in three ways: Hinglish to Pure Hindi and Pure English, Pure Hindi to Pure English, and vice versa. With Hinglish input sentences, the tool achieved an accuracy of 91% for Hindi output sentences and 84% for English output sentences. The paper also compares the tool with another similar tool.

    Exploration of Corpus Augmentation Approach for English-Hindi Bidirectional Statistical Machine Translation System

    Even though a lot of Statistical Machine Translation (SMT) research is being done for the English-Hindi language pair, no effort has been made to standardize the dataset. Each research work uses a different dataset, different parameters and a different number of sentences during the various phases of translation, resulting in varied translation output. Comparing these models, understanding their results, gaining insight into corpus behavior for them, and reproducing the results of these research works therefore becomes tedious. This necessitates standardizing the dataset and identifying common parameters for model development. The main contribution of this paper is to discuss an approach to standardize the dataset and to identify the best parameters, which in combination give the best performance. It also investigates a novel corpus augmentation approach to improve the translation quality of an English-Hindi bidirectional statistical machine translation system. This model works well for scarce-resource settings without incorporating an external parallel data corpus of the underlying language. The experiment is carried out using the open-source phrase-based toolkit Moses. The Indian Languages Corpora Initiative (ILCI) Hindi-English tourism corpus is used. With a limited dataset, considerable improvement is achieved using the corpus augmentation approach for the English-Hindi bidirectional SMT system.

    Analyzing Use of Thanks to You: Insights for Language Teaching and Assessment in Second and Foreign Language Contexts

    This investigation of "thanks to you" in British and American usage was precipitated by a situation at an American university, in which a native Arabic speaker said "thanks to you" in isolation, making his intended meaning unclear. The study analyzes use of "thanks to you" in the Corpus of Contemporary American English and the British National Corpus to gain insights for English language instruction/assessment in the American context, as well as in English-as-a-lingua-franca contexts where the majority of speakers are not native speakers of English or are speakers of different varieties of English, but where American or British English is, for educational purposes, the standard variety. Analysis of the two corpora revealed three functions for "thanks to you" common to British and American usage: expressing gratitude, communicating "because of you" positively, and communicating "because of you" negatively (as in sarcasm). A fourth use of "thanks to you", thanking journalists/guests for being on news programs/talk shows, occurred in the American corpus only. Analysis indicates that felicitous use of "thanks to you" for each of these meanings depends on the presence of a range of factors, both linguistic and material, in the context of utterance.