Corpus Stylistics and Henry James’s Syntax

Author: Moss L
Publication venue: UCL (University College London)
Publication date: 28/01/2015

The starting point of this dissertation is a methodological question: how can corpus stylistics be used to analyse the syntax of literary fiction? A comparison of the syntax of Henry James’s late style in The Golden Bowl (1904) and his early style in Washington Square (1881) was used as a case study. While James’s late style is very widely discussed by literary critics and often seen as ‘difficult’, there has been very little evidence offered to substantiate this description. Within the extensive field of Henry James studies, there have been few linguistic descriptions of James’s prose. To remedy this, I compiled The Henry James Parsed Corpus (HJPC) from five chapters from each of the two novels. My analysis of the corpus showed that The Golden Bowl is more syntactically complex than Washington Square in a number of ways but only in sentences which do not contain direct speech. James’s idiosyncratic use of parenthesis was defined precisely using syntactic criteria and named delay. The Golden Bowl has more delay than Washington Square but also only in non-speech sentences. Only a small number of sentences have very high numbers of dependent clauses and/or delay. I argue that these exceptional sentences create the impression that the later text is homogeneously difficult. My research shows that this impression is deceptive; in fact the overwhelming majority of sentences in The Golden Bowl are no more syntactically complex than those of Washington Square. A secondary use of the HJPC is to assist close reading. Chapter outlines of the central chapter of each novel were generated and were found to mirror plot developments and dialogue sections. Salient sentences highlighted many key moments in the plot, or revealed aspects of characters’ personalities.

UCL Discovery

Seeking the unseen humanities macrostructures: The use of corpus- and genre-assisted research methodologies to analyze written norms in English and Spanish literary criticism articles

Author: Lake William
Publication venue: ScholarWorks @ Georgia State University
Publication date: 11/08/2020

Descriptive studies of general and discipline-specific academic writing genre conventions have paved the way for pedagogical materials that build real-world skills for novice academic writers. To name some better-known cases, breakthroughs have taken place in this regard in the fields of psychology, engineering, and chemistry. However, attested scholarship on rhetorical patterns in humanities writing, such as published literary criticism (hereafter “LC”) is less common. This dearth of research affects scholars of literature produced by Spanish-speakers who write in both English and Spanish. Many L1 Spanish user scholars must often publish their research in English, rather than Spanish, to maintain institutional employment. Postsecondary Spanish majors in the U.S. must also demonstrate competence in literary criticism to gain credentials. To address the needs of these groups, the present study examines the potential of lexical bundles, qualitative content, and multidimensional analyses to help describe LC from a lexico-grammatical perspective. Such findings may facilitate an arrival at a comprehensive schematic of strategies used by expert-level literary scholars in Spanish and English. First, using multidimensional analysis, linguistic features characteristic of literary criticism writing are analyzed and interpreted in the context of prior multidimensional analyses to offer insight on ways in which the written norms of LC compare to those espoused in other genres previously analyzed. Next, the study examines the syntactic structures and functions of lexical bundles used in English and Spanish LC writing, with particular attention to quasi-equivalent and language-specific bundles. Finally, the study proposes a taxonomy of communicative strategies utilized by literary scholars in their arguments. Devised via qualitative content analysis, this taxonomy may extend the functional analysis of bundles in LC. 
These findings offer further insight into the macrostructures of literary criticism, as well as the sentence-level strategies that serve as building blocks for expert-level writing in the genre.

ScholarWorks @ Georgia State University

On the importance of audio material in spoken linguistics: A case study of the London–Lund Corpus 2

Author: Johansson Victoria
Paradis Carita
Pöldvere Nele
Publication date: 01/01/2021

Lund University Publications

Language and Linguistics in a Complex World Data, Interdisciplinarity, Transfer, and the Next Generation. ICAME41 Extended Book of Abstracts

Author: Busse Beatrix
Dumrukcic Nina
Möhlig-Falke Ruth
Publication date: 01/01/2021

This is a collection of papers, work-in-progress reports, and other contributions that were part of the ICAME41 digital conference.

Kölner UniversitätsPublikationsServer

A Quantitative Corpus-based Analysis of Linking Adverbials in Students’ Academic Writing

Author: Wodarczyk Łukasz
Publication venue: Uniwersytet Lodzki (University of Lodz)
Publication date: 01/01/2013

Open access to this publication of Wydawnictwo Uniwersytetu Łódzkiego (University of Lodz Press) was financed under the project "Scientific excellence as the key to excellence in education". The project is funded by the European Social Fund under the Operational Programme Knowledge Education Development; agreement no. POWER.03.05.00-00-Z092/17-00.

Crossref

Repozytorium Uniwersytetu Łódzkiego (University of Lodz Repository)

You had me at hello: How phrasing affects memorability

Author: Cheng Justin
Danescu-Niculescu-Mizil Cristian
Kleinberg Jon
Lee Lillian
Publication date: 01/01/2012

Understanding the ways in which information achieves widespread public awareness is a research question of significant interest. We consider whether, and how, the way in which the information is phrased --- the choice of words and sentence structure --- can affect this process. To this end, we develop an analysis framework and build a corpus of movie quotes, annotated with memorability information, in which we are able to control for both the speaker and the setting of the quotes. We find that there are significant differences between memorable and non-memorable quotes in several key dimensions, even after controlling for situational and contextual factors. One is lexical distinctiveness: in aggregate, memorable quotes use less common word choices, but at the same time are built upon a scaffolding of common syntactic patterns. Another is that memorable quotes tend to be more general in ways that make them easy to apply in new contexts --- that is, more portable. We also show how the concept of "memorable language" can be extended across domains.

Comment: Final version of paper to appear at ACL 2012. 10pp, 1 fig. Data, demo memorability test and other info available at http://www.cs.cornell.edu/~cristian/memorability.htm

arXiv.org e-Print Archive

CiteSeerX

Who’s Blogging Now? Linguistic Features and Authorship Analysis in Sports Blogs

Publication date: 01/01/2017

The field of authorship determination, previously largely falling under the umbrella of literary analysis but recently becoming a large subfield of forensic linguistics, has grown substantially over the last two decades. As its body of research and its record of successful forensic application continue to grow, this growth is paralleled by the demand for its application. However, methods which have undergone rigorous testing to show their reliability and replicability, allowing them to meet the strict Daubert criteria put forth by the US court system, have not truly been established. In this study, I set out to investigate how a list of parameters, many commonly used in the methodologies of previous researchers, would perform when used to test documents of bloggers from a sports blog, Winging It in Motown. Three prolific bloggers were chosen from the site, and a corpus of posts was created for each blogger which was then examined for each of the chosen parameters. One test document for each of the three bloggers which was not included in that blogger’s corpus was then chosen from the blog page, and these documents were examined for each of the parameters via the same methodologies as were used to examine the corpora. Once data for the corpora and all three test documents was obtained, the results were compared for similarity, and an author determination was made for each test document along each parameter. The findings indicated that overall the parameters were quite unsuccessful in determining authorship for these test documents based on the author corpora developed for the study. Only two parameters successfully identified the authors of the test documents at a rate higher than chance, and the possibility exists that other factors may be driving these successful identifications, demanding further research to confirm their validity as parameters for the purpose of authorship work.

Dissertation/Thesis. Doctoral Dissertation. English 201

ASU Digital Repository

The Gutenberg English Poetry Corpus: Exemplary Quantitative Narrative Analyses

Author: Andrzejewski
Aryani
Baroni
Bird
Bohrn
Bornet
Braun
Brysbaert
Burrows
Clements
Deerwester
Frank
Ganascia
Geurts
Hanauer
Jacobs
Jakobson
Jurafsky
Katz
Leech
Michel
Mitchell
Moretti
Nicklas
O’Sullivan
Pedregosa
Roe
Schmidtke
Schrott
Simonton
Stamatatos
Stenneken
Steyvers
Stockwell
Tsur
Turner
Turney
Ullrich
van den Hoven
van Halteren
Vendler
Westbury
Willems
Ziegler
Zipf
Publication date: 01/01/2018

This paper describes a corpus of about 3,000 English literary texts with about 250 million words extracted from the Gutenberg project that span a range of genres from both fiction and non-fiction written by more than 130 authors (e.g., Darwin, Dickens, Shakespeare). Quantitative narrative analysis (QNA) is used to explore a cleaned subcorpus, the Gutenberg English Poetry Corpus (GEPC), which comprises over 100 poetic texts with around two million words from about 50 authors (e.g., Keats, Joyce, Wordsworth). Some exemplary QNA studies show author similarities based on latent semantic analysis, significant topics for each author or various text-analytic metrics for George Eliot’s poem “How Lisa Loved the King” and James Joyce’s “Chamber Music,” concerning, e.g., lexical diversity or sentiment analysis. The GEPC is particularly suited for research in Digital Humanities, Computational Stylistics, or Neurocognitive Poetics, e.g., as training and test corpus for stimulus development and control in empirical studies.

Institutional Repository of the Freie Universität Berlin

Crossref