A novel lip geometry approach for audio-visual speech recognition
By identifying lip movements and characterizing their associations with speech sounds, the performance of speech recognition systems can be improved, particularly when operating in noisy environments. In recent years, research groups around the world have studied various methods for incorporating lip movements into speech recognition; however, exactly how best to incorporate the additional visual information is still not known. This study aims to extend the knowledge of relationships between visual and speech information, specifically using lip geometry information because of its robustness to head rotation and the smaller number of features required to represent movement. A new method has been developed to extract lip geometry information, to perform classification and to integrate the visual and speech modalities. This thesis makes several contributions. First, this work presents a new method to extract lip geometry features using a combination of a skin colour filter, a border following algorithm and a convex hull approach. The proposed method was found to improve lip shape extraction performance compared to existing approaches. Lip geometry features including height, width, ratio, area, perimeter and various combinations of these features were evaluated to determine which performs best when representing speech in the visual domain. Second, a novel template matching technique has been developed that is able to adapt to dynamic differences in the way words are uttered by speakers; it determines the best fit of an unseen feature signal to those stored in a database template. Third, following an evaluation of integration strategies, a novel method has been developed based on an alternative decision fusion strategy, in which the outcome from the visual and speech modalities is chosen by measuring the quality of the audio based on kurtosis and skewness analysis and driven by white noise confusion. Finally, the performance of the new methods introduced in this work is evaluated using the CUAVE and LUNA-V data corpora under a range of signal-to-noise ratio conditions using the NOISEX-92 dataset.
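The lip geometry features named in the abstract (height, width, ratio, area and perimeter) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it assumes the skin colour filter and border following steps have already produced a list of `(x, y)` contour points, it uses Andrew's monotone chain as one possible convex hull algorithm, and it assumes the ratio is height divided by width. The function names `convex_hull` and `lip_geometry_features` are hypothetical.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of the cross product (a - o) x (b - o)
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def lip_geometry_features(contour):
    """Compute geometry features from the convex hull of a lip contour.

    `contour` is assumed to be a list of (x, y) pixel coordinates produced
    by an earlier segmentation step that is not shown here.
    """
    hull = convex_hull(contour)
    xs = [x for x, _ in hull]
    ys = [y for _, y in hull]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)

    n = len(hull)
    # Shoelace formula for the area of the hull polygon.
    twice_area = sum(
        hull[i][0] * hull[(i + 1) % n][1] - hull[(i + 1) % n][0] * hull[i][1]
        for i in range(n)
    )
    area = abs(twice_area) / 2.0
    perimeter = sum(math.dist(hull[i], hull[(i + 1) % n]) for i in range(n))

    return {
        "width": width,
        "height": height,
        "ratio": height / width if width else 0.0,  # assumed: height / width
        "area": area,
        "perimeter": perimeter,
    }
```

Computing such a feature dictionary per video frame yields a short feature signal over time for each utterance, which is the kind of input a template matching stage could then compare against stored templates.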
Trialing project-based learning in a new EAP ESP course: A collaborative reflective practice of three college English teachers
Currently, in many Chinese universities, the traditional College English course is at risk of being ‘marginalized’, replaced or even removed, and many hours previously allocated to the course are now being taken by EAP or ESP. At X University in northern China, such a curriculum reform is taking place, as a result of which a new course called ‘xue ke’ English has been created. Although ‘xue ke’ literally means ‘subject’, the course designer has made it clear that subject content is not the target, nor is the course the same as EAP or ESP. This curriculum initiative, while possibly justified by a rationale of some kind (e.g. to meet the changing social and/or academic needs of students and/or institutions), poses a great challenge for, and puts considerable pressure on, a number of College English teachers who have taught this single course for almost their entire teaching careers. In this context, three teachers formed a peer support group in Semester One this year to tackle the challenge collaboratively, and they chose Project-Based Learning (PBL) for the new course. This presentation will report on the implementation of this project, including the overall design, the operational procedure, and the teachers’ reflections.
Through discussion, the teachers agreed in advance on the purpose and manner of the collaboration: offering peer support for more effective teaching and learning and for fulfilling, pleasant professional development. A WeChat group was set up as the chief platform for messaging, idea-sharing, and resource-exchanging. Physical meetings were supplementary, with a sound agenda but flexible times and venues. Mosoteach cloud class (lan mo yun ban ke) was established as a tool for virtual learning, employed both in and after class. Discussions held at the beginning of the semester determined only brief outlines for PBL implementation and left space for everyone to explore autonomously in their own way. Constant further discussions followed, which generated many opportunities for peer learning and lesson plan modifications. Each teacher also kept a reflective journal, in greater or lesser detail, to record the journey of the collaboration. At the end of the semester, the teachers agreed that, although challenges existed, the collaboration was a success overall, and they were all willing to continue with it and to refine it into a more professional and productive approach.
Deep Learning for Automatic Assessment and Feedback of Spoken English
Growing global demand for learning a second language (L2), particularly English, has led to
considerable interest in automatic spoken language assessment, whether for use in computer-assisted language learning (CALL) tools or for grading candidates for formal qualifications.
This thesis presents research conducted into the automatic assessment of spontaneous non-native English speech, with a view to providing meaningful feedback to learners. One
of the challenges in automatic spoken language assessment is giving candidates feedback on
particular aspects, or views, of their spoken language proficiency, in addition to the overall
holistic score normally provided. Another is detecting pronunciation and other types of errors
at the word or utterance level and feeding them back to the learner in a useful way.
It is usually difficult to obtain accurate training data with separate scores for different
views and, as examiners are often trained to give holistic grades, single-view scores can
suffer issues of consistency. Conversely, holistic scores are available for various standard
assessment tasks such as Linguaskill. An investigation is thus conducted into whether
assessment scores linked to particular views of the speaker’s ability can be obtained from
systems trained using only holistic scores.
End-to-end neural systems are designed with structures and forms of input tuned to single
views, specifically each of pronunciation, rhythm, intonation and text. By training each
system on large quantities of candidate data, it should be possible to extract
individual-view information. The relationships between the predictions of each system are evaluated to examine
whether they are, in fact, extracting different information about the speaker. Three methods
of combining the systems to predict the holistic score are investigated, namely averaging their
predictions, concatenating their intermediate representations, and attending over their intermediate representations. The
combined graders are compared to each other and to baseline approaches.
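The three combination methods can be illustrated with a minimal Python sketch. It is not the thesis's architecture: the trainable layers that would map a combined representation to a holistic score are omitted, the attention is a plain dot-product attention with a hypothetical `query` vector, and the function names are assumptions made for illustration.

```python
import math


def average_predictions(predictions):
    """Combine single-view graders by averaging their predicted scores."""
    return sum(predictions) / len(predictions)


def concatenate(representations):
    """Combine graders by concatenating their intermediate representations."""
    return [value for rep in representations for value in rep]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(representations, query):
    """Combine graders by attending over their intermediate representations.

    Each representation is scored by its dot product with `query`
    (a hypothetical learned vector); the softmax of the scores weights
    the representations in the pooled output.
    """
    scores = [sum(q * r for q, r in zip(query, rep)) for rep in representations]
    weights = softmax(scores)
    dim = len(representations[0])
    pooled = [
        sum(w * rep[i] for w, rep in zip(weights, representations))
        for i in range(dim)
    ]
    return pooled, weights
```

In this sketch, averaging operates on the final scores of the single-view systems, whereas concatenation and attention operate on their intermediate representations, so a further trained layer would be needed to turn the combined vector into a score.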
The tasks of error detection and error tendency diagnosis become particularly challenging
when the speech in question is spontaneous, especially given
the inconsistency of human annotation of pronunciation errors. An approach to these tasks is
presented by distinguishing between lexical errors, wherein the speaker does not know how a
particular word is pronounced, and accent errors, wherein the candidate’s speech exhibits
consistent patterns of phone substitution, deletion and insertion. Three annotated corpora
of non-native English speech by speakers of multiple L1s are analysed, the consistency of
human annotation investigated and a method presented for detecting individual accent and
lexical errors and diagnosing accent error tendencies at the speaker level.
Automatic identification of terms for the generation of students’ concept maps
Proceedings of the 4th International Conference on Multimedia and Information and Communication Technologies in Education, M-icte 2006, held in Seville (Spain) in November 2006.
Willow, an adaptive multilingual free-text Computer-Assisted Assessment system, automatically
evaluates students’ free-text answers given a set of correct ones. This paper presents an extension of the
system in order to generate the students’ concept maps while they are being assessed. To that aim, a new
module for the automatic identification of the terms of a particular knowledge field has been created. It
identifies and keeps track of the terms that are being used in the students’ answers, and calculates a confidence
score of the student's knowledge about each term. An empirical evaluation using the students' real
answers shows that it is robust enough to generate a good set of terms from a very small set of answers. This work has been sponsored by the Spanish Ministry of Science and Technology, project number TIN2004-0314
- …