146 research outputs found

    Measuring Syntactic Development in L2 Writing: Fine Grained Indices of Syntactic Complexity and Usage-Based Indices of Syntactic Sophistication

    Syntactic complexity has been an area of significant interest in L2 writing development studies over the past 45 years. Despite the regularity with which syntactic complexity measures have been employed, the construct is still relatively under-developed, and, as a result, the cumulative results of syntactic complexity studies can appear opaque. At least three reasons exist for the current state of affairs, namely the lack of consistency and clarity with which indices of syntactic complexity have been described, the overly broad nature of the indices that have been regularly employed, and the omission of indices that focus on usage-based perspectives. This study seeks to address these three gaps through the development and validation of the Tool for the Automatic Assessment of Syntactic Sophistication and Complexity (TAASSC). TAASSC measures large- and fine-grained clausal and phrasal indices of syntactic complexity and usage-based frequency/contingency indices of syntactic sophistication. Using TAASSC, this study will address L2 writing development in two main ways: through the examination of syntactic development longitudinally and through the examination of human judgments of writing proficiency (e.g., expert ratings of TOEFL essays). This study will have important implications for second language acquisition, second language writing, and language assessment.

    Span Identification of Epistemic Stance-Taking in Academic Written English

    Responding to the increasing need for automated writing evaluation (AWE) systems to assess language use beyond lexis and grammar (Burstein et al., 2016), we introduce a new approach to identify rhetorical features of stance in academic English writing. Drawing on the discourse-analytic framework of engagement in the Appraisal analysis (Martin & White, 2005), we manually annotated 4,688 sentences (126,411 tokens) for eight rhetorical stance categories (e.g., PROCLAIM, ATTRIBUTION) and additional discourse elements. We then report an experiment to train machine learning models to identify and categorize the spans of these stance expressions. The best-performing model (RoBERTa + LSTM) achieved macro-averaged F1 of .7208 in the span identification of stance-taking expressions, slightly outperforming the intercoder reliability estimates before adjudication (F1 = .6629). Comment: The 18th Workshop on Innovative Use of NLP for Building Educational Applications

    Measuring Longitudinal Writing Development Using Indices of Syntactic Complexity and Sophistication

    Measures of syntactic complexity such as mean length of T-unit have been common measures of language proficiency in studies of second language acquisition. Despite the ubiquity and usefulness of such structure-based measures, they could be complemented with measures based on usage-based theories, which focus on the development of not just syntactic forms but also form-meaning pairs, called constructions (Ellis, 2002). Recent cross-sectional research (Kyle & Crossley, 2017) has indicated that indices related to usage-based characteristics of verb argument construction (VAC) use may be better indicators of writing proficiency than structure-based indices of syntactic complexity. However, because cross-sectional studies can only show general trends across proficiency benchmarks, it is important to test these findings in individuals over time (Lowie & Verspoor, 2019). Thus, this study investigates the developmental trajectories of second language learners of English across two academic years with regard to syntactic complexity and VAC sophistication

    College Readiness as Perceived by First-Year Community College Students Taking Remedial Courses

    Roughly 60% of first-year community college students attending a community college in Idaho need to take remedial courses. Such a high percentage of first-year community college students in remedial courses indicates that students are not being properly prepared for collegiate studies. The purpose of this study was to understand college readiness through the perception of first-year community college students who were taking remedial courses. The framework for this study builds on Conley's multidimensional model of college readiness. Data from 10 semi-structured interviews conducted with community college students taking remedial courses provided information about their opinions and ideas about college readiness, in addition to evaluations regarding what was missing in their K-12 education to prepare them for collegiate studies. Through open-ended data coding, interrelated themes were analyzed, and the interpreted meaning was shared through a qualitative narrative. The findings from this study suggest that college readiness is more than academic knowledge and understanding. The K-12 education system should help students to focus on specific skills such as time management and note taking and to seek out their passions and goals. The findings also suggest that the K-12 education system within the United States needs to be restructured to incorporate a system that encourages and supports student success through more individualized learning that places focus on student passions. When students are given the opportunity to pursue their passions, they gain more interest and motivation to learn and build a strong sense of self-efficacy.

    Applying Natural Language Processing Tools to a Student Academic Writing Corpus: How Large are Disciplinary Differences Across Science and Engineering Fields?

    • Background: Researchers have been working towards better understanding differences in professional disciplinary writing (e.g., Ewer & Latorre, 1969; Hu & Cao, 2015; Hyland, 2002; Hyland & Tse, 2007) for decades. Recently, research has taken important steps towards understanding disciplinary variation in student writing. Much of this research is corpus-based and focuses on lexico-grammatical features in student writing as captured in the British Academic Written English (BAWE) corpus and the Michigan Corpus of Upper-level Student Papers (MICUSP). The present study extends this work by analyzing lexical and cohesion differences among disciplines in MICUSP. Critically, we analyze not only linguistic differences in macro-disciplines (science and engineering), but also in micro-disciplines within these macro-disciplines (biology, physics, industrial engineering, and mechanical engineering).
    • Literature Review: Hardy and Römer (2013) used a multidimensional analysis to investigate linguistic differences across four macro-disciplines represented in MICUSP. Durrant (2014, in press) analyzed vocabulary in texts produced by student writers in the BAWE corpus by discipline and level (year) and disciplinary differences in lexical bundles. Ward (2007) examined lexical differences within micro-disciplines of a single discipline.
    • Research Questions: The research questions that guide this study are as follows: 1. Are there significant lexical and cohesive differences between science and engineering student writing? 2. Are there significant lexical and cohesive differences between micro-disciplines within science and engineering student writing?
    • Research Methodology: To address the research questions, student-produced science and engineering texts from MICUSP were analyzed with regard to lexical sophistication and textual features of cohesion. Specifically, 22 indices of lexical sophistication calculated by the Tool for the Automatic Analysis of Lexical Sophistication (TAALES; Kyle & Crossley, 2015) and 38 cohesion indices calculated by the Tool for the Automatic Analysis of Cohesion (TAACO; Crossley, Kyle, & McNamara, 2016) were used. These features were then compared both across science and engineering texts (addressing Research Question 1) and across micro-disciplines within science and engineering (biology and physics, industrial and mechanical engineering) using discriminant function analyses (DFA).
    • Results: The DFAs revealed significant linguistic differences, not only between student writing in the two macro-disciplines but also between the micro-disciplines. Differences in classification accuracy based on students’ years of study hovered at about 10%. An analysis of accuracies of classification by paper type found they were similar for larger and smaller sample sizes, providing some indication that paper type was not a confounding variable in classification accuracy.
    • Discussion: The findings provide strong support that macro-disciplinary and micro-disciplinary differences exist in student writing in these MICUSP samples and that these differences are likely not related to student level or paper type. These findings have important implications for understanding disciplinary differences. First, they confirm previous research that found the vocabulary used by different macro-disciplines to be “strikingly diverse” (Durrant, 2015), but they also show a remarkable diversity of cohesion features. The findings suggest that the common understanding of the STEM disciplines as “close” bears reconsideration in linguistic terms. Second, the lexical and cohesion differences between micro-disciplines are large enough and consistent enough to suggest that each micro-discipline can be thought of as containing a unique linguistic profile of features. Third, the differences discerned in the NLP analysis are evident at least as early as the final year of undergraduate study, suggesting that students at this level already have a solid understanding of the conventions of the disciplines of which they are aspiring to be members. Moreover, the differences are relatively homogeneous across levels, which confirms findings by Durrant (2015) but, importantly, extends these findings to include cohesion markers.
    • Conclusions: The findings from this study provide evidence that macro-disciplinary and micro-disciplinary differences at the linguistic level exist in student writing, not only in lexical use but also in text cohesion. A number of pedagogical applications of writing analytics are proposed based on the reported findings from TAALES and TAACO. Further studies using different corpora (e.g., BAWE) or purpose-assembled corpora are suggested to address limitations in the size and range of text types found within MICUSP. This study also points the way toward studies of disciplinary differences using NLP approaches that capture data beyond the lexical and cohesive features of text, including the use of part-of-speech tags, syntactic parsing, indices related to syntactic complexity and similarity, rhetorical features, or more advanced cohesion metrics (latent semantic analysis, latent Dirichlet allocation, Word2Vec approaches).

    Unemployment, negative equity, and strategic default

    Using new household-level data, we quantitatively assess the roles that job loss, negative equity, and wealth (including unsecured debt, liquid assets, and illiquid assets) play in default decisions. In sharp contrast to prior studies that proxy for individual unemployment status using regional unemployment rates, we find that individual unemployment is the strongest predictor of default. We find that individual unemployment increases the probability of default by 5 to 13 percentage points, ceteris paribus, compared with the sample average default rate of 3.9 percent. We also find that only 13.9 percent of defaulters have both negative equity and enough liquid or illiquid assets to make one month's mortgage payment. This finding suggests that "ruthless" or "strategic" default during the 2007-09 recession was relatively rare and that policies designed to promote employment, such as payroll tax cuts, are most likely to stem defaults in the long run rather than policies that temporarily modify mortgages.

    Can't pay or won't pay? Unemployment, negative equity, and strategic default

    Prior research has found that job loss, as proxied for by regional unemployment rates, is a weak predictor of mortgage default. In contrast, using micro data from the PSID, this paper finds that job loss and adverse financial shocks are important determinants of mortgage default. Households with an unemployed head are approximately three times as likely to default as households with an employed head. Similarly, households that experience divorce, report large outstanding medical expenses, or have had any other severe income loss are much more likely to default. While household-level employment and financial shocks are important drivers of mortgage default, our analysis shows that the vast majority of financially distressed households do not default. More than 80 percent of unemployed households with less than one month of mortgage payment in savings are current on their payments. We argue that this has important implications for theoretical models of mortgage default as well as for loss mitigation policies. Finally, this paper provides some of the first direct evidence on the extent of strategic default. Wealth data suggest a limited scope for strategic default, with only one-third of underwater defaulters having enough liquid assets to cover one month's mortgage payment.

    Performing a High-Throughput Virtual Screening (HTVS) to identify potential therapeutic targets of YB-1 protein

    Background: Hepatocellular carcinoma (HCC) is a primary malignancy of the liver. Hispanic-Texans have several risk factors and disparities that compound the risk of HCC diagnosis and treatment. The most used chemotherapeutic drug against HCC is sorafenib, but many liver cancers have developed a resistance to this drug. The knockdown of Y-box binding protein-1 (YB-1) has been shown to greatly increase sensitivity to sorafenib. In this study, we discuss the identification of potential YB-1 inhibitors, which can lead to re-sensitization of liver cancer cells to sorafenib. Methodology: The RCSB Protein Data Bank (PDB) was used to retrieve the crystal structure of YB-1, while the DrugBank database was used to obtain a list of experimental and approved drugs. A multiple sequence alignment (MSA) of YB-1 and Lin28 was done with Clustal Omega. Biovia Discovery Studio 2020 was used to visualize 3D models and perform a High-Throughput Virtual Screening (HTVS), which includes rigid docking via the LibDock extension, flexible docking via the CDocker extension, and pharmacokinetic profiling via an ADMET analysis. Results: The cold shock domain of YB-1 was found to be conserved with Lin28, as a known transcription factor. 22 drug candidates were identified through HTVS. The best six show a decent binding ability in both rigid and flexible dockings and have been previously tested in different cancer types to some extent. Conclusion: We were able to identify six potential drug candidates for inhibiting our protein of interest, YB-1. Studies are in progress to test them on sorafenib-resistant HCC cell lines.

    Effects of Feeding Fluorescent Brightener 28 and Blue Dextran to European Corn Borer Larvae

    The European corn borer (ECB), Ostrinia nubilalis, is a common pest for corn crops in the United States and most of Europe (Hudan and LeRoux, 1986). Use of traditional pesticides to control ECB has resulted in the development of resistance in pest populations and significant loss of important biological control species. As such, novel methods, such as use of RNA interference (RNAi), are necessary to overcome resistance to traditional pesticides and protect non-target insects. RNAi takes advantage of intrinsic pathways that use long double-stranded RNAs (dsRNAs) to suppress the expression of specific genes (Zhang et al., 2010); however, many insects do not produce an efficient RNAi response, at least partially as a result of the presence of double-stranded ribonucleases (dsRNases) that degrade the dsRNAs prior to incorporation into the RNAi pathway (Kim et al., 2015). These dsRNases are present in the guts of many species, including ECB, and are a powerful factor limiting the efficacy of RNAi. We hypothesize that applying dsRNA when expression of dsRNases is low (such as in young ECB larvae) and degrading the gut lining will minimize contact between dsRNAs and dsRNases and increase RNAi efficiency. Fluorescent Brightener 28 (FB28) is a chemical that has been used previously to damage insect gut linings and so is a good candidate for performing these experiments (Zhang et al., 2010). In addition, blue dextran (BD) is necessary as a marker to demonstrate successful weakening and increased permeability of the gut; however, it is still unclear what concentrations of these chemicals ECB larvae will tolerate without significant adverse effects. Accordingly, these experiments were designed to identify the optimum levels of FB28 and BD needed in the diet of larval ECB for clear visualization of gut disruption.