224 research outputs found

    Comparing the production of a formula with the development of L2 competence

    This pilot study investigates the production of a formula alongside the development of L2 competence across the proficiency levels of a spoken learner corpus. The results show that the formula in beginner production data is likely being recalled holistically from learners’ phonological memory rather than generated online, identifiable by virtue of its fluent production in the absence of any other surface structure evidence of the formula’s syntactic properties. As learners’ L2 competence increases, the formula becomes sensitive to modifications which show structural conformity at each proficiency level. The transparency between the formula’s modification and learners’ corresponding L2 surface structure realisations suggests that it is the independent development of L2 competence which integrates the formula into compositional language, and ultimately drives the SLA process forward.

    HCLAS-X: Hierarchical and Cascaded Lyrics Alignment System Using Multimodal Cross-Correlation

    In this work, we address the challenge of lyrics alignment, which involves aligning the lyrics and vocal components of songs. This problem requires the alignment of two distinct modalities, namely text and audio. To overcome this challenge, we propose a model that is trained in a supervised manner, utilizing the cross-correlation matrix of latent representations between vocals and lyrics. Our system is designed in a hierarchical and cascaded manner. It predicts synced time first on a sentence-level and subsequently on a word-level. This design enables the system to process long sequences, as the cross-correlation uses quadratic memory with respect to sequence length. In our experiments, we demonstrate that our proposed system achieves a significant improvement in mean average error, showcasing its robustness in comparison to the previous state-of-the-art model. Additionally, we conduct a qualitative analysis of the system after successfully deploying it in several music streaming services
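The memory argument in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the dot-product similarity, and the equal split of audio frames across sentences are all assumptions made only to show why a sentence-then-word cascade needs far fewer matrix cells than one flat frame-by-word matrix.

```python
from typing import List

Vector = List[float]


def cross_correlation(audio: List[Vector], text: List[Vector]) -> List[List[float]]:
    """Dot-product similarity between every audio frame and every text unit.

    The result has len(audio) * len(text) cells, which is the source of the
    quadratic memory cost mentioned in the abstract.
    """
    return [[sum(a * t for a, t in zip(frame, unit)) for unit in text]
            for frame in audio]


def flat_cells(n_frames: int, n_words: int) -> int:
    """Cells in a single frame-by-word matrix over the whole song."""
    return n_frames * n_words


def cascaded_cells(n_frames: int, words_per_sentence: List[int]) -> int:
    """Cells needed by a two-stage (sentence, then word) alignment.

    Assumption for illustration only: after the sentence-level pass, each
    sentence is assigned an equal share of the audio frames.
    """
    n_sentences = len(words_per_sentence)
    coarse = n_frames * n_sentences            # frame-by-sentence matrix
    frames_per_sentence = n_frames // n_sentences
    fine = sum(frames_per_sentence * w for w in words_per_sentence)
    return coarse + fine


if __name__ == "__main__":
    # Hypothetical song: 6000 audio frames, 40 sentences of 8 words each.
    sentences = [8] * 40
    print("flat:", flat_cells(6000, sum(sentences)))
    print("cascaded:", cascaded_cells(6000, sentences))
```

Under these assumed sizes the cascade touches a fraction of the cells of the flat matrix, and the gap widens as the song gets longer.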

    Machine Learning Algorithm for the Scansion of Old Saxon Poetry

    Several scholars have designed tools to perform the automatic scansion of poetry in many languages, but none of these tools deals with Old Saxon or Old English. This project aims to be a first attempt to create a tool for these languages. We implemented a Bidirectional Long Short-Term Memory (BiLSTM) model to perform the automatic scansion of Old Saxon and Old English poems. Since this model uses supervised learning, we manually annotated the Heliand manuscript, and we used the resulting corpus as a labeled dataset to train the model. In evaluation, the algorithm reached 97% accuracy and a 99% weighted average for precision, recall and F1 score. In addition, we tested the model with some verses from the Old Saxon Genesis and some from The Battle of Brunanburh, and we observed that the model predicted almost all Old Saxon metrical patterns correctly but misclassified the majority of the Old English input verses.

    A computer-assisted approach to the comparison of mainland southeast Asian languages

    This cumulative thesis is based on three separate projects based on a computer-assisted language comparison (CALC) framework to address common obstacles to studying the history of Mainland Southeast Asian (MSEA) languages, such as sparse and non-standardized lexical data, as well as an inadequate method of cognate judgments, and to provide caveats to scholars who will use Bayesian phylogenetic analysis. The first project provides a format that standardizes the sound inventories, regulates language labels, and clarifies lexical items. This standardized format allows us to merge various forms of raw data. The format also summarizes information to assist linguists in researching the relatedness among words and inferring relationships among languages. The second project focuses on increasing the transparency of lexical data and cognate judgments with regard to compound words. The method enables the annotation of each part of a word with semantic meanings and syntactic features. In addition, four different conversion methods were developed to convert morpheme cognates into word cognates for input into the Bayesian phylogenetic analysis. The third project applies the methods used in the first project to create a workflow by merging linguistic data sets and inferring a language tree using a Bayesian phylogenetic algorithm. Furthermore, the project addresses the importance of integrating cross-disciplinary studies into historical linguistic research. Finally, the methods we proposed for managing lexical data for MSEA languages are discussed and summarized in six perspectives. The work can be seen as a milestone in reconstructing human prehistory in an area that has high linguistic and cultural diversity.

    Generative Input: Towards Next-Generation Input Methods Paradigm

    Since the release of ChatGPT, generative models have achieved tremendous success and become the de facto approach for various NLP tasks. However, their application in the field of input methods remains under-explored. Many neural network approaches have been applied to the construction of Chinese input method engines (IMEs). Previous research often assumed that the input pinyin was correct and focused on the Pinyin-to-character (P2C) task, which significantly falls short of meeting users' demands. Moreover, previous research could not leverage user feedback to optimize the model and provide personalized results. In this study, we propose a novel Generative Input paradigm named GeneInput. It uses prompts to handle all input scenarios and other intelligent auxiliary input functions, optimizing the model with user feedback to deliver personalized results. The results demonstrate that we have achieved state-of-the-art performance for the first time in the Full-mode Key-sequence to Characters (FK2C) task. We propose a novel reward model training method that eliminates the need for additional manual annotations, and the performance surpasses GPT-4 in tasks involving intelligent association and conversational assistance. Compared to traditional paradigms, GeneInput not only demonstrates superior performance but also exhibits enhanced robustness, scalability, and online learning capabilities.

    The Taming of Irrationality : An Attempt at Secularizing an Orthography with Religious Connotation among the Lisu in Thailand

    This paper aims at articulating the domestic politics over the secularization of an orthography with religious connotation, the Fraser Script, in the context of the cultural revitalization movement among the Lisu in Thailand. The data and discourses shown here primarily derive from the author’s community-based participatory research over two decades, and from previous works of anthropologists, linguists, and Christian missionaries on the Lisu and their orthographies as well. Regardless of the preceding cleavage between the Lisu who observe rituals based on spiritual beliefs and the Lisu who converted to Christianity, the two groups are now beginning to reconcile by sharing a common problem consciousness, and by secularizing the Fraser Script as a common orthography beyond religious beliefs. However, as the Christians, with their better access to the Fraser Script and the transnational Christian network, expand their political influence through pan-Lisu networking in China, Myanmar and Thailand, the on-going shift of the Fraser Script to a neutral position seems to be subject to oscillation between secularization and desecularization.

    World History, Volume 1: To 1500

    World History, Volume 1: to 1500 is designed to meet the scope and sequence of a world history course to 1500 offered at both two-year and four-year institutions. Suitable for both majors and non-majors, World History, Volume 1: to 1500 introduces students to a global perspective of history couched in an engaging narrative. Concepts and assessments help students think critically about the issues they encounter so they can broaden their perspective of global history. A special effort has been made to introduce and juxtapose people’s experiences of history for a rich and nuanced discussion. Primary source material represents the cultures being discussed from a firsthand perspective whenever possible. World History, Volume 1: to 1500 also includes the work of diverse and underrepresented scholars to ensure a full range of perspectives.

    Open-vocabulary keyword spotting in any language through multilingual contrastive speech-phoneme pretraining

    In this paper, we introduce a massively multilingual speech corpus with fine-grained phonemic transcriptions, encompassing more than 115 languages from diverse language families. Based on this multilingual dataset, we propose CLAP-IPA, a multilingual phoneme-speech contrastive embedding model capable of open-vocabulary matching between speech signals and phonemically transcribed keywords or arbitrary phrases. The proposed model has been tested on two fieldwork speech corpora in 97 unseen languages, exhibiting strong generalizability across languages. Comparison with a text-based model shows that using phonemes as modeling units enables much better crosslinguistic generalization than orthographic texts. Comment: Preprint; Work in Progress

    Program and Proceedings: The Nebraska Academy of Sciences 1880-2023. 142th Anniversary Year. One Hundred-Thirty-Third Annual Meeting April 21, 2023. Hybrid Meeting: Nebraska Wesleyan University & Online, Lincoln, Nebraska

    AERONAUTICS & SPACE SCIENCE Chairperson(s): Dr. Scott Tarry & Michaela Lucas HUMANS PAST AND PRESENT Chairperson(s): Phil R. Geib & Allegra Ward APPLIED SCIENCE & TECHNOLOGY SECTION Chairperson(s): Mary Ettel BIOLOGY Chairpersons: Lauren Gillespie, Steve Heinisch, and Paul Davis BIOMEDICAL SCIENCES Chairperson(s): Annemarie Shibata, Kimberly Carlson, Joseph Dolence, Alexis Hobbs, James Fletcher, Paul Denton CHEM Section Chairperson(s): Nathanael Fackler EARTH SCIENCES Chairpersons: Irina Filina, Jon Schueth, Ross Dixon, Michael Leite ENVIRONMENTAL SCIENCE Chairperson: Mark Hammer PHYSICS Chairperson(s): Dr. Adam Davis SCIENCE EDUCATION Chairperson: Christine Gustafson 2023 Maiben Lecturer: Jason Bartz 2023 FRIEND OF SCIENCE AWARD TO: Ray Ward and Jim Lewi