Joining hands: developing a sign language machine translation system with and for the deaf community
This paper discusses the development of an automatic machine translation (MT) system for translating spoken language text into signed languages (SLs). The motivation for our work is the improvement of accessibility to airport information announcements for D/deaf and hard of hearing people. This paper demonstrates the involvement of Deaf colleagues and members of the D/deaf community in Ireland in three areas of our research: the choice of a domain for automatic translation that has a practical use for the D/deaf community; the human translation of English text into Irish Sign Language (ISL) as well as advice on ISL grammar and linguistics; and the importance of native ISL signers as manual evaluators of our translated output
The Effect of Varied Gender Groupings on Argumentation Skills among Middle School Students in Different Cultures
The purpose of this mixed-methods study was to explore the effect of varied gender groupings on argumentation skills among middle school students in Taiwan and the United States in a project-based learning environment that incorporated a graph-oriented computer-assisted application (GOCAA). A total of 43 students comprised the treatment condition and were engaged in the collaborative argumentation process in same-gender groupings. Of these 43 students, 20 were located in the U.S. and 23 were located in Taiwan. A total of 40 students comprised the control condition and were engaged in the collaborative argumentation process in mixed-gender groupings. Of these 40 students, 19 were in the U.S. and 21 were in Taiwan. In each country, verbal collaborative argumentation was recorded and the students' post essays were collected. Among females in Taiwan, one-way analysis of variance (ANOVA) indicated a statistically significant gender-grouping effect on the total argumentation skills outcome, while MANOVA indicated no significant gender-grouping effect on the combined set of skill outcomes. Among females in the U.S., MANOVA indicated a statistically significant gender-grouping effect on the combined set of argumentation skills outcomes. Specifically, U.S. female students in mixed-gender groupings (the control condition) significantly outperformed female students in single-gender groupings (the treatment condition) in the counterargument and rebuttal skills. No significant group differences were observed among males. A qualitative analysis was conducted to examine how the graph-oriented computer-assisted application supported students' development of argumentation skills in different gender groupings in both countries. In each country, all teams in both conditions demonstrated a similar pattern of collaborative argumentation with the exception of three female teams in the U.S.
Female teams and male teams (the treatment condition) and mixed-gender teams (the control condition) demonstrated metacognitive regulation skills to different degrees and with different scaffolding
An Exploratory Application of Rhetorical Structure Theory to Detect Coherence Errors in L2 English Writing: Possible Implications for Automated Writing Evaluation Software
This paper presents an initial attempt to examine whether Rhetorical Structure Theory (RST) (Mann & Thompson, 1988) can be fruitfully applied to the detection of the coherence errors made by Taiwanese low-intermediate learners of English. This investigation is considered warranted for three reasons. First, other methods for bottom-up coherence analysis have proved ineffective (e.g., Watson Todd et al., 2007). Second, this research provides a preliminary categorization of the coherence errors made by first language (L1) Chinese learners of English. Third, second language discourse errors in general have received little attention in applied linguistic research. The data are 45 written samples from the LTTC English Learner Corpus, a Taiwanese learner corpus of English currently under construction. The rationale of this study is that diagrams which violate some of the rules of RST diagram formation will point to coherence errors. No reliability test has been conducted since this work is at an initial stage. Therefore, this study is exploratory and results are preliminary. Results are discussed in terms of the practicality of using this method to detect coherence errors, their possible consequences for claims of a typical inductive content order in the writing of L1 Chinese learners of English, and their potential implications for Automated Writing Evaluation (AWE) software, since discourse organization is one of the essay characteristics assessed by this software. In particular, the extent to which the kinds of errors detected through the RST analysis match those located by Criterion (Burstein, Chodorow, & Leacock, 2004), a well-known AWE program from Educational Testing Service (ETS), is discussed
MAC: A unified framework boosting low resource automatic speech recognition
We propose a unified framework for low resource automatic speech recognition
tasks named meta audio concatenation (MAC). It is easy to implement and can be
carried out in extremely low resource environments. Mathematically, we give a
clear description of the MAC framework from the perspective of Bayesian sampling.
In this framework, we leverage a novel concatenative synthesis text-to-speech
system to boost the low resource ASR task. By the concatenative synthesis
text-to-speech system, we can integrate language pronunciation rules and adjust
the TTS process. Furthermore, we propose a broad notion of meta audio set to
meet the modeling needs of different languages and different scenes when using
the system. Extensive experiments have demonstrated the great effectiveness of
MAC on low resource ASR tasks. For the CTC greedy search, CTC prefix, attention,
and attention rescoring decode modes in the Cantonese, Taiwanese, and Japanese
ASR tasks, the MAC method can reduce the CER by more than 15%.
Furthermore, in the ASR task, MAC beats wav2vec2 (with fine-tuning) on the Common
Voice dataset of Cantonese and gets highly competitive results on the Common Voice
datasets of Taiwanese and Japanese. Notably, we
achieve a 10.9% character error rate (CER) on the Common Voice
Cantonese ASR task, about a 30% relative improvement compared
to wav2vec2 (with fine-tuning)
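For readers unfamiliar with the metrics quoted in this abstract, the following is a minimal sketch of how character error rate and relative improvement are commonly computed. It is not taken from the MAC paper: the function names are hypothetical, and CER is assumed here to be the character-level Levenshtein distance divided by the reference length.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalised by reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_improvement(baseline: float, new: float) -> float:
    """Fraction by which `new` reduces the error rate relative to `baseline`."""
    return (baseline - new) / baseline
```

Under this definition, a 10.9% CER with an improvement of exactly 30% would correspond to a baseline of roughly 10.9 / 0.7 ≈ 15.6%. The abstract says "about 30%", so that figure is only indicative.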
2kenize: Tying Subword Sequences for Chinese Script Conversion
Simplified Chinese to Traditional Chinese character conversion is a common
preprocessing step in Chinese NLP. Despite this, current approaches have poor
performance because they do not take into account that a simplified Chinese
character can correspond to multiple traditional characters. Here, we propose a
model that can disambiguate between mappings and convert between the two
scripts. The model is based on subword segmentation, two language models, as
well as a method for mapping between subword sequences. We further construct
benchmark datasets for topic classification and script conversion. Our proposed
method outperforms previous Chinese Character conversion approaches by 6 points
in accuracy. These results are further confirmed in a downstream application,
where 2kenize is used to convert the pretraining dataset for topic classification.
An error analysis reveals that our method's particular strengths are in dealing
with code-mixing and named entities. Comment: Accepted to ACL 202
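The one-to-many problem described in this abstract can be illustrated with a toy sketch. This is not the 2kenize model: the candidate table and the bigram scores below are hypothetical stand-ins for a real mapping table and language model, chosen only to show why context is needed to pick the right traditional character.

```python
from itertools import product

# Hypothetical candidate table: the simplified character 发 maps to
# either 發 (as in "develop") or 髮 (as in "hair").
CANDIDATES = {"发": ["發", "髮"], "头": ["頭"], "展": ["展"]}

# Hypothetical bigram scores standing in for a language model.
TOY_BIGRAM_SCORES = {("頭", "髮"): 2.0, ("發", "展"): 2.0}


def convert(text: str) -> str:
    """Pick the traditional sequence with the highest toy bigram score."""
    # Characters without an entry are passed through unchanged.
    options = [CANDIDATES.get(ch, [ch]) for ch in text]

    def score(seq):
        return sum(TOY_BIGRAM_SCORES.get(pair, 0.0) for pair in zip(seq, seq[1:]))

    # Exhaustive enumeration is only workable for very short inputs.
    best = max(product(*options), key=score)
    return "".join(best)


print(convert("头发"))  # hair
print(convert("发展"))  # develop
```

A character-by-character lookup would have to choose a single output for 发 regardless of context, which is the failure mode the abstract attributes to earlier approaches.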
A Study on Implementation of Southern-Min Taiwanese Tone Sandhi System
PACLIC 19 / Taipei, Taiwan / December 1-3, 200