Search CORE

23 research outputs found

Vergleich als Methode? Zur Empirisierung eines philologischen Verfahrens im Zeitalter der Digital Humanities [Abstract]

Author: Klimek Sonja
Müller Ralph
Publication venue: JLT Artikel
Publication date: 28/08/2015

Journal of Literary Theory (JLT)

Finn’s Hotel and the Joycean Canon

Author: O'Sullivan James
Publication venue: Department of Applied Linguistics, Translators and Interpreters, University of Antwerp
Publication date: 25/07/2017

Cork Open Research Archive

Finn’s Hotel and the Joycean Canon

Author: O'Sullivan James
Publication venue: 'Modern Language Association'
Publication date: 01/01/2014

Initially, I conduct a stylometric analysis of Dubliners, A Portrait of the Artist as a Young Man, Ulysses, Finnegans Wake, and Finn’s Hotel, using the relative frequencies of the 100 most frequent words in each text to form an authorial signature. In doing so, I hope to demonstrate whether the collection is, from the perspective of style, quite distinct, or alternatively, closely aligned to Finnegans Wake. If style can be considered a determinant of what makes a text, then I believe that the results of such an analysis should be accepted as an indicator of whether Joyce intended Finn’s Hotel to be a standalone publication, or whether the relevant manuscripts are indeed the earliest incarnations of what would eventually come to be Finnegans Wake

Humanities Commons

Investigating formulaic language as a marker of Authorship

Author: Larner Samuel
Publication date: 01/01/2012

CLoK

The Secret to Popular Chinese Web Novels: A Corpus-Driven Study

Author: Hsieh Shu-Kai
Lin Yi-Ju
Publication venue: OASIcs - OpenAccess Series in Informatics. 2nd Conference on Language, Data and Knowledge (LDK 2019)
Publication date: 01/01/2019

What is the secret to writing popular novels? The issue is an intriguing one among researchers from various fields. The goal of this study is to identify the linguistic features of several popular web novels as well as how the textual features found within and the overall tone interact with the genre and themes of each novel. Apart from writing style, non-textual information may also reveal details behind the success of web novels. Since web fiction has become a major industry with top writers making millions of dollars and their stories adapted into published books, determining essential elements of "publishable" novels is of importance. The present study further examines how non-textual information, namely, the number of hits, shares, favorites, and comments, may contribute to several features of the most popular published and unpublished web novels. Findings reveal that keywords, function words, and lexical diversity of a novel are highly related to its genres and writing style while dialogue proportion shows the narration voice of the story. In addition, relatively shorter sentences are found in these novels. The data also reveal that the number of favorites and comments serve as significant predictors for the number of shares and hits of unpublished web novels, respectively; however, the number of hits and shares of published web novels is more unpredictable

Dagstuhl Research Online Publication Server

Identifying idiolect in forensic authorship attribution: an n-gram textbite approach

Author: Johnson A
Wright D
Publication venue: Faculdade de Letras da Universidade do Porto
Publication date: 01/01/2014

Forensic authorship attribution is concerned with identifying authors of disputed or anonymous documents, which are potentially evidential in legal cases, through the analysis of linguistic clues left behind by writers. The forensic linguist “approaches this problem of questioned authorship from the theoretical position that every native speaker has their own distinct and individual version of the language [...], their own idiolect” (Coulthard, 2004: 31). However, given the difficulty in empirically substantiating a theory of idiolect, there is growing concern in the field that it remains too abstract to be of practical use (Kredens, 2002; Grant, 2010; Turell, 2010). Stylistic, corpus, and computational approaches to text, however, are able to identify repeated collocational patterns, or n-grams, two to six word chunks of language, similar to the popular notion of soundbites: small segments of no more than a few seconds of speech that journalists are able to recognise as having news value and which characterise the important moments of talk. The soundbite offers an intriguing parallel for authorship attribution studies, with the following question arising: looking at any set of texts by any author, is it possible to identify ‘n-gram textbites’, small textual segments that characterise that author’s writing, providing DNA-like chunks of identifying material?

Nottingham Trent Institutional Repository (IRep)

White Rose Research Online

Translating English verbal collocations into Spanish: On distribution and other relevant differences related to diatopic variation

Author: Corpas Pastor Gloria
Publication venue: 'John Benjamins Publishing Company'
Publication date: 01/01/2015

Language varieties should be taken into account in order to enhance fluency and naturalness of translated texts. In this paper we will examine the collocational verbal range for prima-facie translation equivalents of words like decision and dilemma, which in both languages denote the act or process of reaching a resolution after consideration, resolving a question or deciding something. We will be mainly concerned with diatopic variation in Spanish. To this end, we set out to develop a giga-token corpus-based protocol which includes a detailed and reproducible methodology sufficient to detect collocational peculiarities of transnational languages. To our knowledge, this is one of the first observational studies of this kind. The paper is organised as follows. Section 1 introduces some basic issues about the translation of collocations against the background of languages’ anisomorphism. Section 2 provides a feature characterisation of collocations. Section 3 deals with the choice of corpora, corpus tools, nodes and patterns. Section 4 covers the automatic retrieval of the selected verb + noun (object) collocations in general Spanish and the co-existing national varieties. Special attention is paid to comparative results in terms of similarities and mismatches. Section 5 presents conclusions and outlines avenues of further research.

Crossref

Wolverhampton Intellectual Repository and E-theses

The Portrait of Dorian Gray: A corpus-based analysis of translated verb + noun (object) collocations in Peninsular and Colombian Spanish

Author: Corpas Pastor Gloria
Valencia Giraldo M. Victoria
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 24/01/2022

This is an accepted manuscript of an article published by Springer in: Corpas Pastor G., Mitkov R. (eds) Computational and Corpus-Based Phraseology. EUROPHRAS 2019 on 18/09/2019, available online: https://doi.org/10.1007/978-3-030-30135-4_30 The accepted version of the publication may differ from the final published version.
Corpus-based Translation Studies have promoted research on the features of translated language, by focusing on the process and product of translation, from a descriptive perspective. Some of these features have been proposed by Toury [31] under the term of laws of translation, namely the law of growing standardisation and the law of interference. The law of standardisation appears to be particularly at play in diatopy, and more specifically in the case of transnational languages (e.g. English, Spanish, French, German). In fact, some studies have revealed the tendency to standardise the diatopic varieties of Spanish in translated language [8, 9, 11, 12]. This paper focuses on verb + noun (object) collocations of Spanish translations of The Portrait of Dorian Gray by Oscar Wilde. Two different varieties have been chosen (Peninsular and Colombian Spanish). Our main aim is to establish whether the Colombian Spanish translation actually matches the variety spoken in Colombia or it is closer to general or standard Spanish. For this purpose, the techniques used to translate this type of collocations in both Spanish translations will be analysed. Furthermore, the diatopic distribution of these collocations will be studied by means of large corpora.

Wolverhampton Intellectual Repository and E-theses

Atribuição de autoria em micro-mensagens

Author: Cavalcante Thiago, 1989-
Publication venue: [s.n.]
Publication date: 26/08/2018

Advisors: Ariadne Maria Brito Rizzoni Carvalho, Anderson de Rezende Rocha. Dissertation (master's), Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica.
Abstract: With the ever-growing use of social media, authorship attribution plays an important role in preventing cybercrime and in helping to analyse online trails left behind by cyber pranks, stalkers, bullies, identity thieves and the like. In this dissertation, we propose a method for authorship attribution in micro blogs that is one hundred to a thousand times faster than state-of-the-art counterparts. We also achieved an accuracy of 65% when classifying texts from 50 authors. The method relies on a powerful and scalable feature representation approach taking advantage of user patterns on micro-blog messages, and also on a custom-tailored pattern classifier adapted to deal with big data and high-dimensional data. Finally, we discuss search space reduction when analysing hundreds of online suspects and millions of online micro messages, which makes this approach invaluable for digital forensics and law enforcement.
Degree: Master's (Mestrado) in Computer Science.

Repositorio da Producao Cientifica e Intelectual da Unicamp

Reassessing the Apuleian corpus

Author: Baltes
Brandwood
Eder
Gaisser
Graverini
Harrison
Hockey
Horsfall Scotti
Hunink
Jannidis
Justin Stover
Kestemont
Kohl
Kroon's
Lee
Lytle
Maggiulli
Marangoni
Marriott
McGann
Meissner
Mike Kestemont
Mosteller
Opeku
Sandy
Schreibman
Schöch
Stover
Winkler's
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/01/2016

Crossref

Edinburgh Research Explorer

Institutional Repository Universiteit Antwerpen