119 research outputs found

Robust Parsing for Ungrammatical Sentences

Author: Baradaran Hashemi Homa
Publication date: 31/01/2018

Natural Language Processing (NLP) is a research area that specializes in studying computational approaches to human language. However, not all natural language sentences are grammatically correct. Sentences that are ungrammatical, awkward, or too casual/colloquial tend to appear in a variety of NLP applications, from product reviews and social media analysis to intelligent language tutors or multilingual processing. In this thesis, we focus on parsing, because it is an essential component of many NLP applications. We investigate in what ways the performance of statistical parsers degrades when dealing with ungrammatical sentences. We also hypothesize that breaking up parse trees at problematic parts prevents NLP applications from degrading due to incorrect syntactic analysis. A parser is robust if it can overlook problems such as grammar mistakes and produce a parse tree that closely resembles the correct analysis for the intended sentence. We develop a robustness evaluation metric and conduct a series of experiments to compare the performance of state-of-the-art parsers on ungrammatical sentences. The evaluation results show that ungrammatical sentences present challenges for statistical parsers, because the well-formed syntactic trees they produce may not be appropriate for ungrammatical sentences. We also define a new framework for reviewing the parses of ungrammatical sentences and extracting the coherent parts whose syntactic analyses make sense. We call this task parse tree fragmentation. The experimental results suggest that the proposed overall fragmentation framework is a promising way to handle syntactically unusual sentences.

D-Scholarship@Pitt

Syntactic modification at early stages of L2 German writing development: A longitudinal learner corpus study

Author: Vyatkina Nina
Publication venue: Elsevier BV
Publication date: 22/06/2018

This study explores ab initio development of syntactic complexity in a longitudinal corpus of learner German writing from a Dynamic Usage-Based perspective. It contributes to the research on L2 writing complexity by focusing on beginning learners of an L2 other than English (German) and on fine-grained measures of syntactic complexity, operationally defined here as syntactic modification. The results show that not only ubiquitous global measures of syntactic complexity but also more specific measures, namely frequencies of syntactic modifiers, can serve as developmental indices at beginning L2 proficiency levels. The learners in this study modified their writing from the very onset of language study and the overall size and range of the modification system did not significantly change over four semesters. However, its composition changed continuously and reflected non-linear waxing and waning of different modifier categories. The study confirmed some results from previous cross-sectional research showing that interlanguage development is characterized by a decrease in cognitively easier (e.g., uninflected) categories and an increase in cognitively more difficult (e.g., inflected and clausal) categories. The high variability that was found along with uniform group trends demonstrates the necessity of simultaneous investigations of linguistic development in groups and individuals

KU ScholarWorks

A SYSTEMATIC REVIEW AND META-ANALYSIS OF CRITERION-RELATED VALIDITY IN EARLY MATHEMATICS CURRICULUM-BASED MEASUREMENT

Author: McCulloch Leigh M.
Publication venue: East Carolina University
Publication date: 01/01/2010

Research interest in early mathematics curriculum-based measurement (EM-CBM) has increased substantially throughout the course of the past decade. There has been a significant increase in the number of published studies regarding the validation of EM-CBM. Currently, however, there is no quantification or summarization of the multitude of research studies. Curriculum-based measurement can be used in various ways in a school: (a) screening students, (b) monitoring progress, (c) identifying student strengths and weaknesses, and (d) predicting student performance on standardized assessments. Mathematics criterion measures are standardized, norm-referenced, individually administered tests. The purpose of administering mathematics criterion measures in the studies that will be synthesized in the proposed meta-analysis was to establish validity of early mathematics curriculum-based measurement. Mathematics criterion measures are also administered in order to measure an individual's math achievement level, as compared with same-aged peers in a norm-referenced group. Finally, math criterion tests are necessary in order to define the construct that early mathematics CBM is purportedly measuring. A meta-analysis technique was used to quantify the predictor-criterion relationship between EM-CBM and standardized norm-referenced math achievement tests. Research databases were searched to collect all relevant publications. The articles included reported correlation coefficients between EM-CBM and norm-referenced standardized achievement tests, used clear, standardized administration and scoring criteria, administered standardized math criterion assessments concurrently with, or after, the administration of the EM-CBM, and included a sample of participants in the grades between Pre-K and 2. Correlation coefficients were obtained for each predictor-criterion relationship of interest and used as the primary units of analysis.
The first hypothesis was that there would be a strong, positive correlation between the predictor and criterion measures. The results support this hypothesis and indicate that the mean correlation between early numeracy and math achievement is .49. This correlation coefficient signifies a moderate-to-strong relationship between the two variables. The second objective of this study was to examine the variables which influence the relationship between early numeracy and math achievement and determine which variables are moderators. Six variables were identified as moderators: correlation type, predictor skill, criterion skill, grade level, procedural integrity, and predictor category. Specifically, these six variables were qualitative variables found to influence the strength of the relationship between the predictor and criterion variables. M.A.

ScholarShip

Pragmatics & Language Learning, Volume 12

Author: Kasper Gabriele
Nguyen Hanh thi
Yoshimi Dina
Yoshioka Jim
Publication venue: National Foreign Language Resource Center
Publication date: 01/01/2010

Pragmatics & Language Learning Volume 12 examines the organization of second language and multilingual speakers’ talk and pragmatic knowledge across a range of naturalistic and experimental activities. Based on data collected on Danish, English, Hawaiʻi Creole, Indonesian, and Japanese as target languages, the contributions explore the nexus of pragmatic knowledge, interaction, and L2 learning outside and inside of educational settings. Pragmatics & Language Learning (“PLL”), a refereed series sponsored by the National Foreign Language Resource Center at the University of Hawaiʻi, publishes selected papers from the biennial Conference on International Pragmatics & Language Learning under the editorship of the conference hosts and the series editor, Gabriele Kasper

ScholarSpace at University of Hawai'i at Manoa

An analysis of 2005 NAEP 8th grade mathematics achievement items by content strand, problem type and language complexity

Author: Fagan Yvette Marie
Publication venue: USF Scholarship: a digital repository @ Gleeson Library | Geschke Center
Publication date: 01/01/2007

unavailable

CiteSeerX

University of San Francisco

Phonological features of Hong Kong English : patterns of variation and effects on local acceptability

Author: SEWELL Andrew John
Publication venue: Digital Commons @ Lingnan University
Publication date: 01/01/2010

The changing dynamics of international communication in English have led to an intense questioning of the relevance of native-speaker pronunciation models in language teaching and testing. In addition, the World Englishes approach to local varieties has increased their level of recognition. Both of these developments suggest that English pronunciation models need to be reviewed, and Hong Kong represents an interesting case study. Although it has been claimed that Hong Kong English is at the ‘nativization’ stage, the existence of exonormative attitudes towards English is also well known. Two important questions arise from this inherent tension, neither of which has been intensively addressed in previous studies. Firstly, although many of the features of Hong Kong English pronunciation have been described, patterns of inter-speaker variation have not been investigated in detail. Secondly, the attitudes of Hong Kong English users towards the phonological features of their own variety have not been studied in ways that take account of such variation. This dissertation addresses both of these questions by being features-based in approach and using local listeners to evaluate accent samples. After an initial review of the features of Hong Kong English pronunciation, a preliminary study surveys the occurrence of consonantal phonological features within a mini-corpus of speech samples taken from local television programmes. Its findings are presented in the form of an implicational scale, which not only shows the relative frequencies with which different features occurred, but also indicates the existence of implicational patterns of co-occurrence. In the main study, twelve authentic accent samples (eleven Hong Kong speakers and one British speaker) were presented to 52 first-year undergraduate students for evaluation as to their acceptability, defined here as acceptability for pedagogical purposes.
Multivariate statistical analysis discovered firstly that phonological ‘errors’, as marked by the student listeners, were the most important measured factor in determining the acceptability scores, and secondly that only certain types of ‘error’ or ‘feature’ had significant effects. These features were either related to L1 transfer or involved other salient phenomena such as idiosyncratic alterations to syllable structure. The explanatory part of the study includes acceptability as one of the factors determining feature persistence, in an ‘ecological’ or ‘evolutionary’ model of L2 phonology acquisition and development that combines the findings of the preliminary and main studies. Among the other factors that determine feature persistence or disappearance, salience, intelligibility and markedness are invoked as important influences. The acceptability data also has pedagogical implications, in that local listeners did not give the British accent the highest acceptability rating. This contrasts with the findings of previous studies regarding the pedagogical acceptability of the Hong Kong English accent. However, the features-based approach indicates that only certain types of local accent were acceptable to these listeners, and that these accents were more, rather than less, ‘native-like’. In various ways, the study contributes to an understanding of accent variation and acceptability within a new variety of English

Digital Commons @ Lingnan University

What effect does short term Study Abroad (SA) have on learners’ vocabulary knowledge?

Author: THOMAS CATON
Publication date: 01/01/2023

This thesis describes a study which tracks longitudinal changes in vocabulary knowledge during a short-term Study Abroad (SA) experience. A test of productive vocabulary knowledge, Lex30 (Meara & Fitzpatrick, 2000), requiring the production of word association responses, is used to elicit vocabulary from 38 Japanese L1 learners of English at four test times at equal intervals before and after an SA experience. The study starts by investigating whether there are changes in both the total number of words and in the number of less frequently occurring words produced by SA participants. Three additional ways of measuring the development of lexical knowledge over time are then proposed. The first examines changes in the ability of participants of different proficiency levels in producing collocates in response to Lex30 cue words. The second tracks changes in spelling accuracy to measure if improvements take place over time. The third analysis uses an online measuring instrument (Wmatrix; Rayson, 2009) to explore if there are any changes in the mastery of specific semantic domains. The results show that there is significant growth in the productive use of less frequent vocabulary knowledge during the SA period. There is also an increase in collocation production with lower proficiency participants and evidence of some improvement in the way certain vocabulary items are spelled. The tendency for SA learners to produce more words from semantic groups related to SA experiences is also demonstrated. Post-SA tests show that while some knowledge attrition occurs it does not decline to pre-SA levels. The study shows how short-term SA programmes can be evaluated using a word association test, contributing to a better understanding of how vocabulary develops during intensive language learning experiences. It also demonstrates the gradual shift of productive vocabulary knowledge from partial word knowledge to a more complete state of productive mastery.

Cronfa at Swansea University

Predicting Academic Success of Entering Freshmen at an Urban University Through the Assessment of Oral and Written Language Competency

Author: Cobbs Karen D.
Publication venue: ODU Digital Commons
Publication date: 01/04/1998

In Moores and Klas' (1989) definitive study on college student retention, postsecondary administrators ranked the maintenance of student enrollment second in importance on a list of twenty critical issues facing higher education. Of particular relevance to college administrators has been the retention and graduation of African-American college students (D. B. Hawkins, 1994; Western Reserve, 1991). Researchers, in considering the overall problem of student attrition, particularly among African-Americans, have explored such questions as these: Which students are dropping out (Sherman, Giles and Green, 1994; Robinson, 1992)? Why do they discontinue their studies (Austin, 1982; Bohr et al., 1995; Kraft, 1992; Tinto, 1975)? Why is the problem especially serious among African-American students (Ball, 1992; Carris, 1995; Miller, 1990)? Are the traditional prediction and placement measures failing to accurately identify those entering freshmen students with the potential to succeed and those who may require intervention to succeed (Bridgeman & Wendler, 1991; Cole, 1987; Wambach & Brothern, 1989)? If so, are there ways to improve on the process? Would using an alternative or supplementary measure more effectively predict which college students are likely to succeed and which students are likely to succeed in college with intervention? The majority of colleges utilize prediction measures such as the Scholastic Assessment Test (SAT), the American College Testing Program (ACT) and high school grade point average (HGPA), and placement measures such as the Nelson Denny Reading Test, the Degrees of Reading Power (DRP) test and writing essays, to determine the potential for academic success among freshmen entrants (Lederman et al., 1986; N. V. Wood, 1989).
The focus of this proposed study was an investigation of the effectiveness of using an alternative language-based measure that assesses a freshman's speaking, listening, reading and writing skills, the Test of Adolescent and Adult Language (TOAL-3), for predicting academic success and assuring a fairer evaluation process and greater precision in the identification and placement of entering freshmen. Interestingly, colleges have traditionally ignored a student's level of communication competence (e.g., speaking, listening, reading and writing) in predicting academic achievement (Rubin & Graham, 1988). The academic performance of African-American freshmen constituted a sub-theme, suggested by the higher dropout rates found among this population (Minorities in Higher Education, 1994). This study found that there was no statistically significant difference in the ability of the TOAL-3, when compared to the SAT, DRP and WSPT, to predict first semester grade point average (FGPA) based on language competency, among entering freshmen students in general. However, there was a statistically significant difference between the TOAL-3 and the WSPT in identifying entering freshmen students as either Predicted Success (PS) or Potential Difficulty (PD). There was a statistically significant difference between the TOAL-3 and the SAT as a function of race and gender in identifying freshmen students as either Predicted Success (PS) or Potential Difficulty (PD). There was also a statistically significant difference between the TOAL-3 and the WSPT, in forecasting which freshmen students identified as Predicted Success (PS) would achieve the criterion variable as a function of gender. However, because of the small sample size, caution should be exercised in interpreting these findings.

Old Dominion University

Mediated discourse at the European Parliament: Empirical investigations

Author: Marta Kajzer-Wietrzny, Adriano Ferraresi, Ilmari Ivaska, Silvia Bernardini
Publication venue: Berlin
Publication date: 01/01/2022

The purpose of this book is to showcase a diverse set of directions in empirical research on mediated discourse, reflecting on the state of the art and the increasing intersection between Corpus-based Interpreting Studies (CBIS) and Corpus-based Translation Studies (CBTS). Undeniably, data from the European Parliament (EP) offer a great opportunity for such research. Not only does the institution provide a sizeable sample of oral debates held at the EP together with their simultaneous interpretations into all languages of the European Union; it also makes available written verbatim reports of the original speeches, which used to be translated. From a methodological perspective, EP materials thus guarantee a great degree of homogeneity, which is particularly valuable in corpus studies, where data comparability is frequently a challenge. In this volume, progress is visible in both CBIS and CBTS. In interpreting, it manifests itself notably in the availability of comprehensive transcription, annotation and alignment systems. In translation, datasets are becoming substantially richer in metadata, which allow for increasingly refined multi-factorial analysis. At the crossroads between the two fields, intermodal investigations bring to the fore what these mediation modes have in common and how they differ. The volume is thus aimed in particular at Interpreting and Translation scholars looking for new descriptive insights and methodological approaches in the investigation of mediated discourse, but it may also be of interest to (corpus) linguists analysing parliamentary discourse in general.

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna