
6,711 research outputs found

Spartan Daily January 25, 2012

Author: San Jose State University School of Journalism and Mass Communications
Publication venue: SJSU ScholarWorks
Publication date: 25/01/2012

Volume 138, Issue 1
https://scholarworks.sjsu.edu/spartandaily/1000/thumbnail.jp

SJSU ScholarWorks

Automatic Pronunciation Assessment -- A Review

Author: Ali Ahmed
Chowdhury Shammur Absar
Kheir Yassine El
Publication date: 21/10/2023

Pronunciation assessment and its application in computer-aided pronunciation training (CAPT) have seen impressive progress in recent years. With the rapid growth in language processing and deep learning over the past few years, there is a need for an updated review. In this paper, we review methods employed in pronunciation assessment for both phonemic and prosodic aspects. We categorize the main challenges observed in prominent research trends and highlight existing limitations and available resources. This is followed by a discussion of the remaining challenges and possible directions for future work. Comment: 9 pages, accepted to EMNLP Finding

arXiv.org e-Print Archive

The elastic use of 'some': a comparative study between l1 and l2 speakers in educational settings

Author: Le Nhu Nguyet
Publication venue: Curtin University
Publication date: 01/01/2015

This study explored 'some' using a refreshing approach: focusing on its elasticity. It was a comparative study of L1 (American) and L2 (Chinese and Vietnamese) speakers and found that L2 speakers are vaguer than L1 speakers, and that the elasticity of 'some' is manifested through the fluid, stretchable and strategic features of its pragmatic meanings and functions. The implication is that an understanding of its elastic nature may be integrated into the curriculum of English language teaching.

espace@Curtin

Austronesian and other languages of the Pacific and South-east Asia : an annotated catalogue of theses and dissertations

Author: Coppell W. G.
Publication venue: Dept. of Linguistics, Research School of Pacific Studies, The Australian National University
Publication date: 01/01/1981

Ezid

The Australian National University

MPG.PuRe

Linguistics in East Asia and South East Asia

Author: Noss Richard B.
Yamagiwa Joseph. K.
Yuen Ren Chao
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 27/06/2019

Directory of Open Access Books (DOAB)

VNHSGE: VietNamese High School Graduation Examination Dataset for Large Language Models

Author: Bac-Bien Ngo
Hong-Phuoc Nguyen
Ngoc-Bich Le
The-Duy Vo
Thi-My-Thanh Nguyen
Van-Tien Nguyen
Xuan-Dung Phan
Xuan-Quy Dao
Publication date: 20/05/2023

The VNHSGE (VietNamese High School Graduation Examination) dataset, developed exclusively for evaluating large language models (LLMs), is introduced in this article. The dataset, which covers nine subjects, was generated from the Vietnamese National High School Graduation Examination and comparable tests. It includes 300 literary essays and over 19,000 multiple-choice questions on a range of topics. The dataset assesses LLMs in multitasking situations such as question answering, text generation, reading comprehension, visual question answering, and more by including both textual data and accompanying images. Using ChatGPT and BingChat, we evaluated LLMs on the VNHSGE dataset and contrasted their performance with that of Vietnamese students. The results show that ChatGPT and BingChat both perform at a human level in a number of areas, including literature, English, history, geography, and civics education. They still have room to improve, though, especially in mathematics, physics, chemistry, and biology. The VNHSGE dataset seeks to provide an adequate benchmark for assessing the abilities of LLMs with its wide-ranging coverage and variety of activities. We intend to promote future developments in the creation of LLMs by making this dataset available to the scientific community, especially in resolving LLMs' limits in disciplines involving mathematics and the natural sciences. Comment: 74 pages, 44 figure

arXiv.org e-Print Archive

Collected papers on Southeast Asian and Pacific languages

Author: Bauer Robert S.
Publication venue: Pacific Linguistics, Research School of Pacific and Asian Studies, The Australian National University
Publication date: 01/01/2002

Ezid

The Australian National University

Construing the ecological perspective of the Tai Dam as seen in ‘Sen Huen’ ritual manuscripts

Publication venue: Springer
Publication date: 04/10/2016

Springer - Publisher Connector

Automatic Speech Recognition for Low-resource Languages and Accents Using Multilingual and Crosslingual Information

Author: Vu Ngoc Thang
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2014

This thesis explores methods to rapidly bootstrap automatic speech recognition systems for languages that lack resources for speech and language processing. We focus on finding approaches that allow data from multiple languages to be used to improve performance for those languages at different levels, such as feature extraction, acoustic modeling and language modeling. On the application side, this thesis also includes research work on non-native and code-switching speech.

KITopen


Cross-generational linguistic variation in the Canberra Vietnamese heritage language community: A corpus-centred investigation

Author: Nguyen Li
Publication venue: University of Cambridge
Publication date: 06/10/2020

This dissertation investigates cross-generational linguistic differences in the Canberra Vietnamese bilingual community, with a particular focus on Vietnamese as the heritage language. Specifically, it documents the vernacular and considers key aspects of this data from different theoretical perspectives. Its main contribution is an insight into a rarely studied heritage language variety in a contact community that has never been examined. The dissertation consists of five core chapters, organised into two parts. In the first part (Chapters 2–3), I describe how I documented the vernacular and created the Canberra Vietnamese English Corpus (CanVEC), an original corpus compiled specifically for this study that is also the first to be freely available for research purposes. The corpus consists of over ten hours of spontaneous speech produced by 45 Vietnamese-English bilingual speakers across two generations living in Canberra. In the second part of the study (Chapters 4–6), I put the corpus to use and investigate aspects of the cross-generational differences in Vietnamese as the heritage language in this community. In particular, I first probe the Vietnamese heritage language via its participation in the code-switching discourse (Chapter 4). In doing so, I focus on the applicability of the Matrix Language Framework (MLF) (Myers-Scotton, 1993, 2002) and its associated Matrix Language (ML) Turnover Hypothesis (Myers-Scotton, 1998) to the code-switching data in CanVEC. Since support for this prominent model has mainly come from language pairs that have different clausal word order or vastly different inventories of inflectional morphology, Vietnamese-English as a pair in which both languages are SVO and essentially isolating offers a tantalising testing ground for its application. Results show that the universal claims of this model do not hold so straightforwardly.
CanVEC data challenges several assumptions of the MLF, with the model ultimately only being able to account for around half of the CanVEC code-switching data. I further demonstrate that even when the ML is putatively identifiable and a cross-generational ML ‘turnover’ is quantitatively observed, the predictions do not reflect the direction of structural influence that we see in CanVEC. The MLF approach therefore sheds only limited light on cross-generational language shift and variation in this community. Given that null elements emerge as a distinct area of difficulty in Chapter 4, I take this aspect as the focal point for the next part of the investigation (Chapter 5), where I use the variationist approach (Labov, 1972 et seq.) to explore three cases where null and overt realisation alternates in Vietnamese: subjects, objects, and copulas. In doing so, I move away from the bilingual portion of CanVEC to examine the monolingual heritage Vietnamese subset directly. Results show that Vietnamese null subjects vary significantly across generations, while null objects and copulas remain stable in terms of use. As speakers also overwhelmingly prefer overt forms over null forms (∼70:30) across all three of the variables of interest, I appeal to the generative interface-oriented approach (Sorace & Filiaci, 2006 et seq.) to next examine the distribution of overt subjects, objects, and copulas (Chapter 6). These results converge with what was found for null forms: cross-generational effects were observed for pronominal subjects, but not pronominal objects and copulas. This finding also supports the importance of a distinction drawn in previous works between internal (syntax-semantics) and external (syntax-discourse/pragmatics) interface phenomena, with the latter being seemingly more susceptible to change.
Ultimately, this dissertation highlights the empirical and theoretical value of studying rarely considered contact varieties, while deploying an integrated approach that acknowledges the multi-faceted complexity of the contact communities where these varieties are spoken. Cambridge Trust International Scholarshi

Apollo (Cambridge)