2,584 research outputs found

    Text Style Transfer: A Review and Experimental Evaluation

    The stylistic properties of text have intrigued computational linguistics researchers in recent years. Specifically, researchers have investigated the Text Style Transfer (TST) task, which aims to change the stylistic properties of the text while retaining its style-independent content. Over the last few years, many novel TST algorithms have been developed, while the industry has leveraged these algorithms to enable exciting TST applications. The field of TST research has burgeoned because of this symbiosis. This article aims to provide a comprehensive review of recent research efforts on text style transfer. More concretely, we create a taxonomy to organize the TST models and provide a comprehensive summary of the state of the art. We review the existing evaluation methodologies for TST tasks and conduct a large-scale reproducibility study where we experimentally benchmark 19 state-of-the-art TST algorithms on two publicly available datasets. Finally, we expand on current trends and provide new perspectives on the new and exciting developments in the TST field.

    Stylistic analysis of linguistic devices in radio debates in English and Czech

    This MA thesis focuses on the stylistically marked features that occur in two radio debates: the English programme Any Questions?, aired by BBC Radio 4, and the Czech programme Speciál Martina Veselovského, aired on Český rozhlas 1 - Radiožurnál. Stylistically marked features are restricted to certain kinds of social context: in the case of this thesis, two radio debates broadcast by public service media. The linguistic features considered stylistically marked in the two debates are identified on the morphological, syntactic and lexical levels and classified into categories according to their functions. Subsequently, they are described as standard or nonstandard. Some of the features found are shared by both debates; others, owing to the differences between the two language systems, are characteristic of only one of the languages. The English and Czech stylistically marked features also differ in their frequency of occurrence. Finally, conclusions are drawn about the level of informality of the two debates. Department of the English Language and ELT Methodology, Faculty of Arts

    Seeking systematicity in variation : theoretical and methodological considerations on the “variety” concept

    One century-old discussion in linguistics concerns whether languages, or linguistic systems, are essentially homogeneous or rather show “structured heterogeneity.” This contribution addresses the question of whether and how sociolinguistically defined systems (or ‘varieties’) are to be distinguished in a heterogeneous linguistic landscape: to what extent can structure be found in the myriad language variants heard in everyday language use? We first elaborate on the theoretical importance of this ‘variety question’ by relating it to current approaches from, among others, generative linguistics (competing grammars), sociolinguistics (style-shifting, polylanguaging), and cognitive linguistics (prototype theory). Possible criteria for defining and detecting varieties are introduced, which are subsequently tested empirically, using a self-compiled corpus of spoken Dutch in West Flanders (Belgium). This empirical study demonstrates that the speech repertoire of the studied West Flemish speakers consists of four varieties, viz. a fairly stable dialect variety, a more or less virtual standard Dutch variety, and two intermediate varieties, which we will label ‘cleaned-up dialect’ and ‘substandard.’ On the methodological level, this case study underscores the importance of speech corpora comprising both inter- and intra-speaker variation on the one hand, and the merits of triangulating qualitative and quantitative approaches on the other.

    Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

    This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past decade or so, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures in which such tasks are organised; (b) highlight a number of relatively recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of Natural Language Processing, with an emphasis on different evaluation methods and the relationships between them. Comment: Published in Journal of AI Research (JAIR), volume 61, pp. 75-170. 118 pages, 8 figures, 1 table.

    Pope Francis’s Laudato Si’: A corpus study of environmental and religious discourse

    This paper explores aspects of the lexico-grammar of religiously oriented environmental discourse produced by a leading religious authority, Pope Francis. It examines the most frequent keywords and keyword clusters of the encyclical letter Laudato Si’ against popularised updates on scientific and technological advances available on the NASA website. The findings show that Laudato Si’ draws attention both to how people’s behaviour affects the environment and to its relevance to the current political and economic situation. The Letter also calls for a much-needed caring attitude towards the environment, and thus appears to be characterized by the directive communicative function throughout, while presenting a more specific religious slant only in select chapters. The analysis carried out highlights both the topics and the rhetorical goals of the discourse of Pope Francis.

    Rules of Engagement: Architecture Theory and the Social Sciences in Frank Duffy’s 1974 Thesis on Office Planning

    This paper addresses the broad shift that took place in architectural theory and education in the 1970s, when models of the discipline asserting the autonomy of architecture eclipsed models privileging architecture’s ties to other disciplines, particularly technology and the social sciences. With Frank Duffy's Princeton thesis on open office planning (1974) as a focus, the paper explores the theoretical and institutional contexts of this shift and offers a critical reappraisal in light of contemporary issues facing architecture. Keywords: architectural theory, office space, planning, architectural education

    The Linguistic Construction of Epistemological Difference

    PhD thesis. How are beliefs about the nature of knowledge reflected and reproduced in language use? It is clear that some linguistic resources, e.g. the modal verbs may and must, indicate one’s epistemic stance with respect to a proposition, i.e. one’s judgement of how likely it is to be true. What is less clear is how the use of such resources relates to speakers’ beliefs about the nature of knowledge per se, i.e. their epistemic policies (Teller 2004). To investigate the putative relationship between epistemological variation and linguistic variation, I examine samples of written and spoken English from a community that is particularly epistemologically diverse: academia. I synthesize research on social epistemology, sociolinguistics, linguistic anthropology, and Academic English (AE) to propose an explanatory model of variability in the expression of epistemic stance. Then, using AE as a case study, I evaluate the predictions of this model both quantitatively, via corpus analysis of research articles and regression modelling of interview data, and qualitatively, via analysis of discursive practices in terms of experience-organizing frames (Goffman 1974) and the semiotic notion of indexicality (e.g. Irvine 2001), whereby ideological differences produce, and are reproduced by, linguistic differences. This research makes contributions to a number of fields. It questions the analytic validity of disciplinarity, providing support for a unifying theory of variation in AE based instead on an epistemologically principled analysis of institutional language use. The indexical basis of sociolinguistic research on language and belief/identity is problematized by attending to epistemological context; the ramifications of this will be explored in future research. I develop a linguistic metric of epistemic belief, offering a means of developing a quantitative social epistemology to complement that field’s highly articulated theoretical work. Applications beyond academia are possible in areas concerned with knowledge management and transfer, such as public health. Funding: Queen Mary School of Languages, Linguistics and Film Research Studentship; AHRC Block Grant Partnership PhD Studentship in Linguistics

    Reinventing our understanding of the Left-Right political dichotomy: the case of Argentina

    What happens to a country’s political culture once populism takes root? Have Global North-centered methods of evaluation miscategorized Global South political party identification both historically and contemporaneously? As the world grapples with the continued rise of populism and its divisive rhetoric, scholars must thoroughly examine the movement’s spheres of influence beyond traditionally accepted frameworks. Understanding populist parties is vital, for they oftentimes create staggering disruptions within a nation’s political culture. These disturbances become starkly apparent in times of crisis, as challenges plunge everyday citizens deeper into the political sphere. The case of Argentina allows for an examination of the ways in which populism has created a reality wherein ideology is no more than background noise in political clashes. By interrogating Argentina’s Peronist movement, its destabilization of Argentinian institutions and norms, and its ever-adaptable nature, this research establishes a contextually based method of understanding populist political identities. I argue that the right-left dichotomy is not constructive in describing socio-political environments wherein populism has become the dominant political narrative. The diversity, heterogeneity, and complexity of political realities in the Global South permit us to fruitfully revisit traditional views of an assumed left-right ideological spectrum.

    Style in the vernacular and on the radio: code-switching and mutation as stylistic and social markers in Welsh

    This thesis seeks to analyse two types of linguistic features of Welsh, code-switching and mutation, as sociolinguistic variables: features which encode social information about the speaker and/or stylistic meaning. Developing a study design that incorporates an analysis of code-switching and mutation in naturalistic speech has demanded a relatively novel methodological approach. The study combined a variationist analysis of the vernacular use of both variables in the 40-hour Siarad corpus (Deuchar 2014) with a technique that ranks radio programmes in order of formality through the use of channel cues and other criteria (Ball et al. 1988). This allows for a comparison of the use of code-switching and mutation in multiple stylistic contexts, each of which shows varying degrees of emotional engagement and self-monitoring by speakers. The analysis found that code-switching was strongly correlated with the level of formality of each radio programme, and that at least one aspirate mutation trigger, (a), patterned in a similar way. Some other mutation triggers, most notably the nasal possessive trigger (fy), seemed to be primarily affected by the speakers’ backgrounds, and by their relative ages in particular. A qualitative analysis of the type of discourse found in each radio programme made further links between the institutional style of each programme and its use of the stylistically controlled ‘marker’ variables, with non-standard variants appearing to be indexical of solidarity, subversion and irony, while standard variants indexed prestige, authority and earnestness.