4,184 research outputs found
An Efficient Probabilistic Deep Learning Model for the Oral Proficiency Assessment of Student Speech Recognition and Classification
Natural Language Processing is a branch of artificial intelligence (AI) that focuses on the interaction between computers and human language. Speech recognition systems utilize machine learning algorithms and statistical models to analyze acoustic features of speech, such as pitch, duration, and frequency, to convert spoken words into written text. The Student English Oral Proficiency Assessment and Feedback System provides students with a comprehensive evaluation of their spoken English skills and offers tailored feedback to help them improve. It can be used in language learning institutions, universities, or online platforms to support language education and enhance oral communication abilities. This paper constructs a framework, termed Latent Dirichlet Integrated Deep Learning (LDiDL), for the assessment of student English proficiency. The system begins by collecting a comprehensive dataset of spoken English samples encompassing various proficiency levels. Relevant features are extracted from the samples, including acoustic characteristics and linguistic attributes. Leveraging Latent Dirichlet Allocation (LDA), the system uncovers latent topics within the data, enabling a deeper understanding of the underlying themes present in the spoken English. To further enhance the analysis, a deep learning model is developed that integrates the LDA topics with the extracted features. This model is trained using appropriate techniques and evaluated using performance metrics. Utilizing the predictions made by the model, the system generates personalized feedback for each student, focusing on areas of improvement such as vocabulary, grammar, fluency, and pronunciation. In simulation mode, native English speech audio is used for LDiDL training and classification. The experimental analysis showed that the proposed LDiDL model achieves an accuracy of 99% for the assessment of English proficiency.
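The fusion step described above (LDA topics combined with extracted features, then a learned classifier) can be sketched in a few lines. This is a minimal illustrative reconstruction, not the paper's LDiDL implementation: the transcripts, acoustic features, and labels are toy data, and scikit-learn's LatentDirichletAllocation and a small MLP stand in for the paper's deep model.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neural_network import MLPClassifier

# Toy transcripts of spoken responses (real systems would use ASR output).
transcripts = [
    "I enjoy reading books about science",
    "my favourite food is rice and chicken",
    "science experiments are fun to watch",
    "we cooked dinner with fresh vegetables",
]
# Toy acoustic features, e.g. mean pitch (Hz) and speech rate (words/sec);
# a real system would extract these from the audio signal.
acoustic = np.array([[180.0, 3.1], [150.0, 2.4], [175.0, 3.0], [155.0, 2.5]])
labels = [1, 0, 1, 0]  # invented proficiency classes

# Uncover latent topics from the transcripts with LDA.
counts = CountVectorizer().fit_transform(transcripts)
topics = LatentDirichletAllocation(n_components=2, random_state=0).fit_transform(counts)

# Fuse the topic distribution with the acoustic view and train a classifier.
features = np.hstack([topics, acoustic])
clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
clf.fit(features, labels)
print(clf.predict(features).tolist())
```

The same fused feature matrix could feed any downstream model; the MLP here is only a placeholder for the paper's deep network.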
Fluency Estimation and Prosodic Error Analysis for Japanese Speakers' English Using Multi-Resolution Posteriorgrams
Degree type: Master's thesis, University of Tokyo
Deep Learning for Automatic Assessment and Feedback of Spoken English
Growing global demand for learning a second language (L2), particularly English, has led to
considerable interest in automatic spoken language assessment, whether for use in computer-assisted language learning (CALL) tools or for grading candidates for formal qualifications.
This thesis presents research conducted into the automatic assessment of spontaneous non-native English speech, with a view to providing meaningful feedback to learners. One
of the challenges in automatic spoken language assessment is giving candidates feedback on
particular aspects, or views, of their spoken language proficiency, in addition to the overall
holistic score normally provided. Another is detecting pronunciation and other types of errors
at the word or utterance level and feeding them back to the learner in a useful way.
It is usually difficult to obtain accurate training data with separate scores for different
views and, as examiners are often trained to give holistic grades, single-view scores can
suffer issues of consistency. Conversely, holistic scores are available for various standard
assessment tasks such as Linguaskill. An investigation is thus conducted into whether
assessment scores linked to particular views of the speaker’s ability can be obtained from
systems trained using only holistic scores.
End-to-end neural systems are designed with structures and forms of input tuned to single
views, specifically each of pronunciation, rhythm, intonation and text. By training each
system on large quantities of candidate data, individual-view information should be possible
to extract. The relationships between the predictions of each system are evaluated to examine
whether they are, in fact, extracting different information about the speaker. Three methods
of combining the systems to predict holistic score are investigated, namely averaging their
predictions and concatenating and attending over their intermediate representations. The
combined graders are compared to each other and to baseline approaches.
The tasks of error detection and error tendency diagnosis become particularly challenging
when the speech in question is spontaneous and particularly given the challenges posed by
the inconsistency of human annotation of pronunciation errors. An approach to these tasks is
presented by distinguishing between lexical errors, wherein the speaker does not know how a
particular word is pronounced, and accent errors, wherein the candidate’s speech exhibits
consistent patterns of phone substitution, deletion and insertion. Three annotated corpora
of non-native English speech by speakers of multiple L1s are analysed, the consistency of
human annotation investigated and a method presented for detecting individual accent and
lexical errors and diagnosing accent error tendencies at the speaker level.
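Of the three grader-combination methods the abstract names, averaging predictions is the simplest. The sketch below is purely illustrative (the scores and graders are invented, not taken from the thesis): each single-view grader emits a holistic-score prediction per candidate, and the combined grade is their mean.

```python
import numpy as np

# Invented holistic-score predictions (0-6 scale) from four single-view
# graders for three candidates; in the thesis these would be end-to-end
# neural systems tuned to pronunciation, rhythm, intonation, and text.
pron   = np.array([4.1, 2.8, 5.0])
rhythm = np.array([3.9, 3.2, 4.6])
inton  = np.array([4.3, 2.9, 4.8])
text   = np.array([4.0, 3.0, 5.1])

# Simplest combination: average the per-view predictions per candidate.
combined = np.mean([pron, rhythm, inton, text], axis=0)
print(combined.round(2).tolist())
```

The other two strategies (concatenating or attending over intermediate representations) require access to the graders' hidden layers, so they cannot be reduced to a one-liner like this.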
Large Language Models for Difficulty Estimation of Foreign Language Content with Application to Language Learning
We use large language models to help learners enhance proficiency in a foreign language. This is accomplished by identifying content on topics that the user is interested in and that closely aligns with the learner's proficiency level in that foreign language. Our work centers on French content, but our approach is readily transferable to other languages. Our solution offers several distinctive characteristics that differentiate it from existing language-learning solutions: a) the discovery of content across topics that the learner cares about, thus increasing motivation; b) a more precise estimation of the linguistic difficulty of the content than traditional readability measures; and c) the availability of both textual and video-based content. The linguistic complexity of video content is derived from the video captions. It is our aspiration that such technology will enable learners to remain engaged in the language-learning process by continuously adapting the topics and the difficulty of the content to align with the learners' evolving interests and learning objectives.
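For contrast with the traditional readability measures the abstract says it improves on, here is a minimal sketch of one such measure, the LIX index (average sentence length plus the percentage of long words). The tokenization is deliberately simplified and the French sentences are invented examples:

```python
import re

def lix(text: str) -> float:
    """LIX readability: words per sentence + 100 * (long words / words).
    A 'long' word has more than six characters; higher LIX = harder text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text)
    long_words = [w for w in words if len(w) > 6]
    return len(words) / len(sentences) + 100.0 * len(long_words) / len(words)

easy = "Le chat dort. Il fait beau. Nous mangeons du pain."
hard = ("L'estimation automatique de la difficulté linguistique nécessite "
        "des représentations contextuelles sophistiquées.")
print(round(lix(easy), 1), round(lix(hard), 1))  # the second score is much higher
```

Surface measures like this ignore word frequency, syntax, and meaning, which is precisely the gap an LLM-based difficulty estimator aims to close.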
The Influence of Implementing Communicative Approach in the Language Teaching Process on Students’ Academic Achievement
This research aimed to determine the effect of implementing the communicative approach in language teaching on students' learning outcomes. The research was conducted in Blitar, East Java, Indonesia. The research employed a descriptive correlation design, and 40 elementary school teachers were selected as samples by a random sampling technique. Data were collected through questionnaires and documentation, and analyzed using descriptive statistics and the Pearson Product Moment correlation. The results indicated that implementing the communicative approach in language teaching did not significantly influence the students' learning outcomes in the national examination. However, when the individual activities performed in the learning process through the communicative approach were examined, some items did have a significant influence on students' learning outcomes in social science.
Fluency Strategy Training and the L2 Oral Task Performance of Indonesian EFL Classroom Learners
This quasi-experimental study investigated the impacts of two instructional conditions, explicit fluency strategy training and implicit task-based instruction, on university English learners in Indonesia. The results revealed that neither instructional condition significantly improved participants' speech fluency, but improvement in oral proficiency reached statistical significance. A degree of variability in participants' speech fluency development was also found. Both instructional conditions could be applied with potentially complementary effects in Indonesian EFL classrooms.
Re-examining Phonological and Lexical Correlates of Second Language Comprehensibility: The Role of Rater Experience
Few researchers and teachers would disagree that some linguistic aspects
of second language (L2) speech are more crucial than others for successful
communication. Underlying this idea is the assumption that communicative
success can be broadly defined in terms of speakers’ ability to convey the
intended meaning to the interlocutor, which is frequently captured through
a listener-based rating of comprehensibility or ease of understanding (e.g.
Derwing & Munro, 2009; Levis, 2005). Previous research has shown that
communicative success – for example, as defined through comprehensible L2
speech – depends on several linguistic dimensions of L2 output, including its
segmental and suprasegmental pronunciation, fluency-based characteristics,
lexical and grammatical content, as well as discourse structure (e.g. Field,
2005; Hahn, 2004; Kang et al., 2010; Trofimovich & Isaacs, 2012). Our chief
objective in the current study was to explore the L2 comprehensibility construct from a language assessment perspective (e.g. Isaacs & Thomson, 2013),
by targeting rater experience as a possible source of variance influencing the
degree to which raters use various characteristics of speech in judging L2
comprehensibility. In keeping with this objective, we asked the following
question: to what extent do the linguistic aspects of L2 speech that contribute to comprehensibility ratings depend on raters' experience?
Artificial Intelligence for Multimedia Signal Processing
Artificial intelligence technologies are also actively applied to broadcasting and multimedia processing technologies. A great deal of research has been conducted across a wide variety of fields, such as content creation, transmission, and security, and in the past two to three years attempts have been made to improve image, video, speech, and other data compression efficiency in areas related to MPEG media processing technology. Additionally, technologies such as media creation, processing, editing, and scenario generation are very important areas of research in multimedia processing and engineering. This book collects topics spanning advanced computational intelligence algorithms and technologies for emerging multimedia signal processing, including the computer vision field, speech/sound/text processing, and content analysis/information mining.
Automatic Scaling of Text for Training Second Language Reading Comprehension
For children learning their first language, reading is one of the most effective ways to acquire new vocabulary. Studies link students who read more with larger and more complex vocabularies. For second language learners, there is a substantial barrier to reading: even books written for early first-language readers assume a base vocabulary of nearly 7000 word families and a nuanced understanding of grammar. This project will look at ways that technology can help second language learners overcome this high barrier to entry, and at the effectiveness of learning through reading for adults acquiring a foreign language. Through the implementation of Dokusha, an automatic graded reader generator for Japanese, this project will explore how advancements in natural language processing can be used to automatically simplify text for extensive reading in Japanese as a foreign language.
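The vocabulary-coverage idea underlying graded readers can be sketched in a few lines: check what fraction of a text's tokens fall inside a learner's known vocabulary and flag the rest for simplification. This is an illustrative toy (English tokens and a hypothetical known-word set), not part of Dokusha:

```python
# Hypothetical known vocabulary of an early learner (illustrative only).
known = {"the", "cat", "sat", "on", "a", "mat", "and", "dog", "ran"}

def coverage(tokens, vocab):
    """Return (coverage rate, unknown tokens) for a tokenized text."""
    unknown = [t for t in tokens if t not in vocab]
    return 1 - len(unknown) / len(tokens), unknown

tokens = "the cat sat on the mat and the dog ran outside".split()
rate, unknown = coverage(tokens, known)
print(round(rate, 2), unknown)
```

A graded-reader generator would then rewrite or gloss the flagged tokens until coverage reaches the high threshold (often cited around 95-98%) that extensive-reading research recommends.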