
    Patterns and Variation in English Language Discourse

    The publication comprises the reviewed post-conference proceedings of the 9th Brno Conference on Linguistics Studies in English, an international conference held on 16–17 September 2021 and organised by the Faculty of Education, Masaryk University in Brno. The papers revolve around patterns and variation in specialised discourses (namely media, academic, business, tourism, educational and learner discourses), effective interaction between the addressor and addressees, and current trends and developments in specialised discourses. The principal methodological perspectives are the comparative approach involving discourses in English and another language, critical and corpus analysis, and the identification of pragmatic strategies and appropriate rhetorical means. The authors of the papers are researchers from the Czech Republic, Italy, Luxembourg, Serbia and Georgia.

    A Review of Deep Learning Models for Twitter Sentiment Analysis: Challenges and Opportunities

    The microblogging site Twitter (rebranded as X in July 2023) is one of the most influential online social media websites. It offers a platform for the masses to communicate, express their opinions, and share information on a wide range of subjects and products, resulting in the creation of a large amount of unstructured data. This has attracted significant attention from researchers who seek to understand and analyze the sentiments contained within this massive body of user-generated text. The task of sentiment analysis (SA) entails extracting and identifying user opinions from the text, and various lexicon- and machine learning-based methods have been developed over the years to accomplish this. However, deep learning (DL)-based approaches have recently become dominant due to their superior performance. This study outlines standard preprocessing techniques and various word embeddings for data preparation. It then presents a taxonomy that provides a comprehensive summary of DL-based approaches. In addition, the work compiles popular benchmark datasets and highlights the evaluation metrics employed to measure performance and the publicly available resources that aid SA tasks. Furthermore, the survey discusses domain-specific practical applications of SA. Finally, the study concludes with various research challenges and outlines directions for further investigation.

    Referring to discourse participants in Ibero-Romance languages

    Synopsis: This volume brings together contributions by researchers focusing on personal pronouns in Ibero-Romance languages, going beyond the well-established variable of expressed vs. non-expressed subjects. While factors such as agreement morphology, topic shift and contrast or emphasis have been argued to account for variable subject expression, several corpus studies on Ibero-Romance languages have shown that the expression of subject pronouns goes beyond these traditionally established factors and is also subject to considerable dialectal variation. One of the factors affecting the choice and expression of personal pronouns or other referential devices is whether the construction is used personally or impersonally. The use and emergence of new impersonal constructions, and eventually also new (im)personal pronouns, as well as the variation found in the expression of human impersonality in different Ibero-Romance language varieties, is another interesting research area that has gained ground in recent years. In addition to variable subject expression, similar methods and theoretical approaches have been applied to study the expression of objects. Finally, reference to the addressee(s) using different address pronouns and other address forms is an important field of study that is closely connected to the variable expression of pronouns. The present book sheds light on all these aspects of reference to discourse participants. The volume contains contributions with a strong empirical background that draw on various methods and on both written and spoken corpus data from Ibero-Romance languages. The focus on discourse participants highlights the special properties of first and second person referents and the factors affecting them, which are often different from those affecting the anaphoric third person. The chapters are organized into three thematic sections: (i) Variable expression of subjects and objects, (ii) Between personal and impersonal, and (iii) Reference to the addressee.

    Comparing the production of a formula with the development of L2 competence

    This pilot study compares the production of a formula with the development of L2 competence across the proficiency levels of a spoken learner corpus. The results show that the formula in beginner production data is likely being recalled holistically from learners’ phonological memory rather than generated online, as indicated by its fluent production in the absence of any other surface structure evidence of the formula’s syntactic properties. As learners’ L2 competence increases, the formula becomes sensitive to modifications which show structural conformity at each proficiency level. The transparency between the formula’s modification and learners’ corresponding L2 surface structure realisations suggests that it is the independent development of L2 competence which integrates the formula into compositional language, and ultimately drives the SLA process forward.

    A Primer on Seq2Seq Models for Generative Chatbots

    The recent spread of Deep Learning-based solutions for Artificial Intelligence and the development of Large Language Models have significantly advanced the field of Natural Language Processing. The approach has evolved quickly over the last ten years, deeply affecting NLP, from low-level text pre-processing tasks (such as tokenisation or POS tagging) to high-level, complex NLP applications like machine translation and chatbots. This paper examines recent trends in the development of open-domain data-driven generative chatbots, focusing on Seq2Seq architectures. Such architectures are compatible with multiple learning approaches, ranging from supervised to reinforcement learning, and in recent years they have made it possible to build very engaging open-domain chatbots. Not only do these architectures make it possible to directly output the next turn in a conversation but, to some extent, they also make it possible to control the style or content of the response. To offer a complete view of the subject, we examine possible architecture implementations as well as training and evaluation approaches. Additionally, we provide information about the openly available corpora for training and evaluating such models and about current and past chatbot competitions. Finally, we present some insights on possible future directions, given the current state of research.

    Computer Vision and Architectural History at Eye Level: Mixed Methods for Linking Research in the Humanities and in Information Technology

    Information on the history of architecture is embedded in our daily surroundings, in vernacular and heritage buildings and in physical objects, photographs and plans. Historians study these tangible and intangible artefacts and the communities that built and used them. Thus valuable insights are gained into the past and the present as they also provide a foundation for designing the future. Given that our understanding of the past is limited by the inadequate availability of data, the article demonstrates that advanced computer tools can help gain more and well-linked data from the past. Computer vision can make a decisive contribution to the identification of image content in historical photographs. This application is particularly interesting for architectural history, where visual sources play an essential role in understanding the built environment of the past, yet lack of reliable metadata often hinders the use of materials. The automated recognition contributes to making a variety of image sources usable for research.

    Workshop Proceedings of the 12th edition of the KONVENS conference

    The 2014 issue of KONVENS is even more a forum for exchange: its main topic is the interaction between Computational Linguistics and Information Science, and the synergies such interaction, cooperation and integrated views can produce. This topic at the crossroads of different research traditions which deal with natural language as a container of knowledge, and with methods to extract and manage knowledge that is linguistically represented is close to the heart of many researchers at the Institut für Informationswissenschaft und Sprachtechnologie of Universität Hildesheim: it has long been one of the institute’s research topics, and it has received even more attention over the last few years.

    Modeling and Predicting Literary Reception. A Data-Rich Approach to Literary Historical Reception

    This contribution exemplifies a workflow for the quantitative operationalization and analysis of historical literary reception. We will show how to encode literary historical information in a dataset that is suitable for quantitative analysis and present a nuanced and theory-based perspective on automated sentiment detection in historical literary reviews. Applying our method to corpora of English and German novels and narratives published from 1688 to 1914 and corresponding reviews and circulating library catalogs, we investigate whether a text’s popularity with lay audiences, the attention from contemporary experts or the sentiment in experts’ reviews can be predicted from textual features, with the aim of contributing to the understanding of how literary reception as a social process can be linked to textual qualities.

    Translating Islamic Law: the postcolonial quest for minority representation

    This research sets out to investigate how culture-specific or signature concepts are rendered in English-language discourse on Islamic, or ‘shariʿa’ law, which has Arabic roots. A large body of literature has investigated Islamic law from a technical perspective. However, from the perspective of linguistics and translation studies, little attention has been paid to the lexicon that makes up this specialised discourse. Much of the commentary has so far been prescriptive, with limited empirical evidence. This thesis aims to bridge this gap by exploring how ‘culturalese’ (i.e., ostensive cultural discourse) travels through language, as evidenced in the self-built Islamic Law Corpus (ILC), a 9-million-word monolingual English corpus, covering diverse genres on Islamic finance and family law. Using a mixed methods design, the study first quantifies the different linguistic strategies used to render shariʿa-based concepts in English, in order to explore ‘translation’ norms based on linguistic frequency in the corpus. This quantitative analysis employs two models: profile-based correspondence analysis, which considers the probability of lexical variation in expressing a conceptual category, and logistic regression (using MATLAB programming software), which measures the influence of the explanatory variables ‘genre’, ‘legal function’ and ‘subject field’ on the choice between an Arabic loanword and an endogenous English lexeme, i.e., a close English equivalent. The findings are then interpreted qualitatively in the light of postcolonial translation agendas, which aim to preserve intangible cultural heritage and promote the representation of minoritised groups.
    The research finds that the English-language discourse on Islamic law is characterised by linguistic borrowing and glossing, implying an ideologically driven variety of English that can be usefully labelled as a kind of ‘Islamgish’ (blending ‘Islamic’ and ‘English’) aimed at retaining symbols of linguistic hybridity. The regression analysis confirms the influence of the above-mentioned contextual factors on the use of an Arabic loanword versus English alternatives.