6,711 research outputs found

    Spartan Daily January 25, 2012

    Get PDF
    Volume 138, Issue 1https://scholarworks.sjsu.edu/spartandaily/1000/thumbnail.jp

    Automatic Pronunciation Assessment -- A Review

    Full text link
    Pronunciation assessment and its application in computer-aided pronunciation training (CAPT) have seen impressive progress in recent years. With the rapid growth in language processing and deep learning over the past few years, there is a need for an updated review. In this paper, we review methods employed in pronunciation assessment for both phonemic and prosodic. We categorize the main challenges observed in prominent research trends, and highlight existing limitations, and available resources. This is followed by a discussion of the remaining challenges and possible directions for future work.Comment: 9 pages, accepted to EMNLP Finding

    The elastic use of 'some': a comparative study between l1 and l2 speakers in educational settings

    Get PDF
    This study explored some using a refreshing approach: focusing on its elasticity. It was a comparative study of L1 (American) and L2 (Chinese and Vietnamese) speakers and found that L2 speakers are vaguer than L1 speakers, and that the elasticity of some is manifested through the fluid, stretchable and strategic features of some’s pragmatic meanings and functions. The implication is that an understanding of its elastic nature may be integrated into the curriculum of English language teaching

    Austronesian and other languages of the Pacific and South-east Asia : an annotated catalogue of theses and dissertations

    Get PDF

    VNHSGE: VietNamese High School Graduation Examination Dataset for Large Language Models

    Full text link
    The VNHSGE (VietNamese High School Graduation Examination) dataset, developed exclusively for evaluating large language models (LLMs), is introduced in this article. The dataset, which covers nine subjects, was generated from the Vietnamese National High School Graduation Examination and comparable tests. 300 literary essays have been included, and there are over 19,000 multiple-choice questions on a range of topics. The dataset assesses LLMs in multitasking situations such as question answering, text generation, reading comprehension, visual question answering, and more by including both textual data and accompanying images. Using ChatGPT and BingChat, we evaluated LLMs on the VNHSGE dataset and contrasted their performance with that of Vietnamese students to see how well they performed. The results show that ChatGPT and BingChat both perform at a human level in a number of areas, including literature, English, history, geography, and civics education. They still have space to grow, though, especially in the areas of mathematics, physics, chemistry, and biology. The VNHSGE dataset seeks to provide an adequate benchmark for assessing the abilities of LLMs with its wide-ranging coverage and variety of activities. We intend to promote future developments in the creation of LLMs by making this dataset available to the scientific community, especially in resolving LLMs' limits in disciplines involving mathematics and the natural sciences.Comment: 74 pages, 44 figure

    Collected papers on Southeast Asian and Pacific languages

    Get PDF

    Automatic Speech Recognition for Low-resource Languages and Accents Using Multilingual and Crosslingual Information

    Get PDF
    This thesis explores methods to rapidly bootstrap automatic speech recognition systems for languages, which lack resources for speech and language processing. We focus on finding approaches which allow using data from multiple languages to improve the performance for those languages on different levels, such as feature extraction, acoustic modeling and language modeling. Under application aspects, this thesis also includes research work on non-native and Code-Switching speech
    • …
    corecore