Corpus Stylistics and Henry James’s Syntax
The starting point of this dissertation is a methodological question: how can corpus stylistics be used to analyse the syntax of literary fiction? A comparison of the syntax of Henry James’s late style in The Golden Bowl (1904) and his early style in Washington Square (1881) was used as a case study. While James’s late style is very widely discussed by literary critics and often seen as ‘difficult’, there has been very little evidence offered to substantiate this description. Within the extensive field of Henry James studies, there have been few linguistic descriptions of James’s prose. To remedy this, I compiled The Henry James Parsed Corpus (HJPC) from five chapters from each of the two novels. My analysis of the corpus showed that The Golden Bowl is more syntactically complex than Washington Square in a number of ways, but only in sentences which do not contain direct speech. James’s idiosyncratic use of parenthesis was defined precisely using syntactic criteria and named ‘delay’. The Golden Bowl has more delay than Washington Square, but again only in non-speech sentences. Only a small number of sentences have very high numbers of dependent clauses and/or delay. I argue that these exceptional sentences create the impression that the later text is homogeneously difficult. My research shows that this impression is deceptive; in fact the overwhelming majority of sentences in The Golden Bowl are no more syntactically complex than those of Washington Square. A secondary use of the HJPC is to assist close reading. Chapter outlines of the central chapter of each novel were generated and were found to mirror plot developments and dialogue sections. Salient sentences highlighted many key moments in the plot, or revealed aspects of characters’ personalities.
Seeking the unseen humanities macrostructures: The use of corpus- and genre-assisted research methodologies to analyze written norms in English and Spanish literary criticism articles
Descriptive studies of general and discipline-specific academic writing genre conventions have paved the way for pedagogical materials that build real-world skills for novice academic writers. To name some better-known cases, breakthroughs have taken place in this regard in the fields of psychology, engineering, and chemistry. However, attested scholarship on rhetorical patterns in humanities writing, such as published literary criticism (hereafter “LC”), is less common. This dearth of research affects scholars of literature produced by Spanish-speakers who write in both English and Spanish. Many scholars who are L1 Spanish users must publish their research in English, rather than Spanish, to maintain institutional employment. Postsecondary Spanish majors in the U.S. must also demonstrate competence in literary criticism to gain credentials. To address the needs of these groups, the present study examines the potential of lexical bundles, qualitative content, and multidimensional analyses to help describe LC from a lexico-grammatical perspective. Such findings may help build a comprehensive schematic of the strategies used by expert-level literary scholars in Spanish and English. First, using multidimensional analysis, linguistic features characteristic of literary criticism writing are analyzed and interpreted in the context of prior multidimensional analyses, offering insight into how the written norms of LC compare to those of other genres previously analyzed. Next, the study examines the syntactic structures and functions of lexical bundles used in English and Spanish LC writing, with particular attention to quasi-equivalent and language-specific bundles. Finally, the study proposes a taxonomy of communicative strategies utilized by literary scholars in their arguments. Devised via qualitative content analysis, this taxonomy may extend the functional analysis of bundles in LC. These findings offer further insight into the macrostructures of literary criticism, as well as the sentence-level strategies that serve as building blocks for expert-level writing in the genre.
On the importance of audio material in spoken linguistics : A case study of the London–Lund Corpus 2
Language and Linguistics in a Complex World Data, Interdisciplinarity, Transfer, and the Next Generation. ICAME41 Extended Book of Abstracts
This is a collection of papers, work-in-progress reports, and other contributions that were part of the ICAME41 digital conference.
A Quantitative Corpus-based Analysis of Linking Adverbials in Students’ Academic Writing
Access to publications of Łódź University Press (Wydawnictwo Uniwersytetu Łódzkiego) is financed under the project “Doskonałość naukowa kluczem do doskonałości kształcenia” (“Scientific excellence as the key to excellence in education”). The project is funded by the European Social Fund under the Operational Programme Knowledge Education Development; agreement no. POWER.03.05.00-00-Z092/17-00
You had me at hello: How phrasing affects memorability
Understanding the ways in which information achieves widespread public
awareness is a research question of significant interest. We consider whether,
and how, the way in which the information is phrased --- the choice of words
and sentence structure --- can affect this process. To this end, we develop an
analysis framework and build a corpus of movie quotes, annotated with
memorability information, in which we are able to control for both the speaker
and the setting of the quotes. We find that there are significant differences
between memorable and non-memorable quotes in several key dimensions, even
after controlling for situational and contextual factors. One is lexical
distinctiveness: in aggregate, memorable quotes use less common word choices,
but at the same time are built upon a scaffolding of common syntactic patterns.
Another is that memorable quotes tend to be more general in ways that make them
easy to apply in new contexts --- that is, more portable. We also show how the
concept of "memorable language" can be extended across domains.
Comment: Final version of paper to appear at ACL 2012. 10pp, 1 fig. Data, demo
memorability test and other info available at
http://www.cs.cornell.edu/~cristian/memorability.htm
Who’s Blogging Now? Linguistic Features and Authorship Analysis in Sports Blogs
The field of authorship determination, previously largely falling under the umbrella of literary analysis but recently becoming a large subfield of forensic linguistics, has grown substantially over the last two decades. As its body of research and its record of successful forensic application continue to grow, this growth is paralleled by the demand for its application. However, methods which have undergone rigorous testing to show their reliability and replicability, allowing them to meet the strict Daubert criteria put forth by the US court system, have not truly been established.
In this study, I set out to investigate how a list of parameters, many commonly used in the methodologies of previous researchers, would perform when used to test documents of bloggers from a sports blog, Winging It in Motown. Three prolific bloggers were chosen from the site, and a corpus of posts was created for each blogger which was then examined for each of the chosen parameters. One test document for each of the three bloggers which was not included in that blogger’s corpus was then chosen from the blog page, and these documents were examined for each of the parameters via the same methodologies as were used to examine the corpora. Once data for the corpora and all three test documents was obtained, the results were compared for similarity, and an author determination was made for each test document along each parameter.
The findings indicated that overall the parameters were quite unsuccessful in determining authorship for these test documents based on the author corpora developed for the study. Only two parameters successfully identified the authors of the test documents at a rate higher than chance, and the possibility exists that other factors may be driving these successful identifications, demanding further research to confirm their validity as parameters for the purpose of authorship work.
Dissertation/Thesis; Doctoral Dissertation; English; 201
The Gutenberg English Poetry Corpus: Exemplary Quantitative Narrative Analyses
This paper describes a corpus of about 3,000 English literary texts with about
250 million words extracted from the Gutenberg project that span a range of
genres from both fiction and non-fiction written by more than 130 authors
(e.g., Darwin, Dickens, Shakespeare). Quantitative narrative analysis (QNA) is
used to explore a cleaned subcorpus, the Gutenberg English Poetry Corpus
(GEPC), which comprises over 100 poetic texts with around two million words
from about 50 authors (e.g., Keats, Joyce, Wordsworth). Some exemplary QNA
studies show author similarities based on latent semantic analysis,
significant topics for each author or various text-analytic metrics for George
Eliot’s poem “How Lisa Loved the King” and James Joyce’s “Chamber Music,”
concerning, e.g., lexical diversity or sentiment analysis. The GEPC is
particularly suited for research in Digital Humanities, Computational
Stylistics, or Neurocognitive Poetics, e.g., as training and test corpus for
stimulus development and control in empirical studies.