2,795 research outputs found

    Textual Stylistic Variation: Choices, Genres and Individuals

    This chapter argues for more informed target metrics for the statistical processing of stylistic variation in text collections. Much as operationalized relevance proved a useful goal to strive for in information retrieval, research in textual stylistics, whether application oriented or philologically inclined, needs goals formulated in terms of pertinence, relevance, and utility, notions that agree with reader experience of text. Differences readers are aware of are mostly based on utility, not on textual characteristics per se. Mostly, readers report stylistic differences in terms of genres. Genres, while vague and undefined, are well-established and talked about: very early on, readers learn to distinguish genres. This chapter discusses variation given by genre, and contrasts it to variation occasioned by individual choice.

    Linguistics in the Study and Teaching of Literature

    Literary texts include linguistic form, as well as specialized literary forms (some of which also involve language). Linguistics can offer to literary studies an understanding of these kinds of form, and of the ways in which a text is used to communicate meaning. In order to cope with the great variety of creative uses of language in literature, linguistics must acknowledge that some texts are assigned structure by non-linguistic means, but the boundaries between linguistic and non-linguistic explanations for literary language are not clearly drawn. The article concludes with a discussion of what kinds and level of linguistics might usefully be taught in a literature classroom, and offers practical suggestions for the application of linguistics to literature teaching.

    Computing the Affective-Aesthetic Potential of Literary Texts

    In this paper, we compute the affective-aesthetic potential (AAP) of literary texts by using a simple sentiment analysis tool called SentiArt. In contrast to other established tools, SentiArt is based on publicly available vector space models (VSMs) and requires no emotional dictionary, thus making it applicable in any language for which VSMs have been made available (>150 so far) and avoiding issues of low coverage. In a first study, the AAP values of all words of a widely used lexical databank for German were computed and the VSM's ability to represent concrete and more abstract semantic concepts was demonstrated. In a second study, SentiArt was used to predict ~2800 human word valence ratings and shown to have a high predictive accuracy (R² > 0.5, p < 0.0001). A third study tested the validity of SentiArt in predicting emotional states over (narrative) time using human liking ratings from reading a story. Again, the predictive accuracy was highly significant (adjusted R² = 0.46, p < 0.0001), establishing the SentiArt tool as a promising candidate for lexical sentiment analyses at both the micro- and macrolevels, i.e., short and long literary materials. Possibilities and limitations of lexical VSM-based sentiment analyses of diverse complex literary texts are discussed in the light of these results.

    On Translation Universals in Selected Contemporary Polish Literary Translations

    This pilot study examines the potential of selected corpus linguistics and computational stylistics methods for investigating translation universals in translational literary Polish. More specifically, the study deals with T-universals (after Chesterman 2004), also referred to as intralingual translation universals (Grabowski 2011), with emphasis on core patterns of lexical use, as proposed by Laviosa (1998, 2002), and the leveling-out hypothesis, as proposed by Baker (1996). To that end, two custom-designed corpora of approximately 500,000 tokens each were compiled, one of contemporary Polish novels and one of contemporary literary translations from English into Polish. The results reveal that, taken as a whole, translated texts are more varied lexically than non-translated Polish texts, but also have more repetitions and lower lexical variety among high-frequency words. On the other hand, the study shows that non-translated texts have higher lexical variety among low-frequency words, where author-specific and creative vocabulary is usually found. The results of multivariate methods (Principal Components Analysis and Cluster Analysis) confirm the leveling-out hypothesis: translated texts resemble one another more closely than they resemble native texts written in the same language.

    ATMS-Based architecture for stylistics-aware text generation

    This thesis is concerned with the effect of surface stylistic constraints (SSC) on syntactic and lexical choice within a unified generation architecture. Despite the fact that these issues have been investigated by researchers in the field, little work has been done with regard to system architectures that allow surface form constraints to influence earlier linguistic or even semantic decisions made throughout the NLG process. By SSC we mean those stylistic requirements that are known beforehand but cannot be tested until after the utterance or, in some lucky cases, until a proper linearised part of it has been generated. These include collocational constraints, text size limits, and poetic aspects such as rhyme and metre, to name a few. This thesis introduces a new NLG architecture that can be sensitive to surface stylistic requirements. It brings together a well-founded linguistic theory that has been used in many successful NLG systems (Systemic Functional Linguistics, SFL) and an existing AI search mechanism (the Assumption-based Truth Maintenance System, ATMS) which caches important search information and avoids work duplication. To this end, the thesis explores the logical relation between the grammar formalism and the search technique. It designs, based on that logical connection, an algorithm for the automatic translation of systemic grammar networks to ATMS dependency networks. The generator then uses the translated networks to generate natural language texts with a high paraphrasing power as a direct result of its ability to pursue multiple paths simultaneously. The thesis approaches the crucial notion of choice differently to previous systems using SFL. It relaxes the choice process in that choosers are not obliged to deterministically choose a single alternative, allowing SSC to influence the final lexical and syntactic decisions. The thesis also develops a situation-action framework for the specification of stylistic requirements independently of the micro-semantic input. The user or application can state what surface requirements they wish to impose and the ATMS-based generator then attempts to satisfy these constraints. Finally, a prototype ATMS-based generation system embodying the ideas presented in this thesis is implemented and evaluated. We examine the system's stylistic sensitivity by testing it on three different sets of stylistic requirements, namely: collocational, size, and poetic constraints.

    Building a resource for studying translation shifts

    This paper describes an interdisciplinary approach which brings together the fields of corpus linguistics and translation studies. It presents ongoing work on the creation of a corpus resource in which translation shifts are explicitly annotated. Translation shifts denote departures from formal correspondence between source and target text, i.e. deviations that have occurred during the translation process. A resource in which such shifts are annotated in a systematic way will make it possible to study those phenomena that need to be addressed if machine translation output is to resemble human translation. The resource described in this paper contains English source texts (parliamentary proceedings) and their German translations. The shift annotation is based on predicate-argument structures and proceeds in two steps: first, predicates and their arguments are annotated monolingually in a straightforward manner. Then, the corresponding English and German predicates and arguments are aligned with each other. Whenever a shift, mainly grammatical or semantic, has occurred, the alignment is tagged accordingly. Comment: 6 pages, 1 figure