4,307 research outputs found

New directions in corpus-based translation studies

Author: Fantinuoli Claudio
Zanettin Federico
Publication date: 01/01/2015

Corpus-based translation studies has become a major paradigm and research methodology and has investigated a wide variety of topics in the last two decades. The contributions to this volume add to the range of corpus-based studies by providing examples of some less-explored applications of corpus analysis methods to translation research. They show that the area keeps evolving as it constantly opens up to different frameworks and approaches, from appraisal theory to process-oriented analysis, and encompasses multiple translation settings, including (indirect) literary translation, machine(-assisted) translation and the practical work of professional legal translators. The studies included in the volume also expand the range of corpus applications in terms of the tools used to accomplish the research tasks outlined.

Institutional Repository of the Freie Universität Berlin

New directions in corpus-based translation studies

Author: Doms Steven
Fantinuoli Claudio
Fotopoulou Angeliki
Lapshinova-Koltunski Ekaterina
Mouka Effie
Neumann Stella
Niemietz Paula
Pontrandolfo Gianluca
Sanz Zuriñe
Saridakis Ioannis E.
Serbina Tatiana
Uribarri Ibon
Zanettin Federico
Zubillaga Naroa
Publication venue: OAPEN Foundation
Publication date: 01/04/2020

Directory of Open Access Books (DOAB)

Machine-assisted mixed methods: augmenting humanities and social sciences with artificial intelligence

Author: Karjus Andres
Publication date: 24/09/2023

The increasing capacities of large language models (LLMs) present an unprecedented opportunity to scale up data analytics in the humanities and social sciences, augmenting and automating qualitative analytic tasks previously allocated to human labor. This contribution proposes a systematic mixed methods framework to harness qualitative analytic expertise, machine scalability, and rigorous quantification, with attention to transparency and replicability. Sixteen machine-assisted case studies are showcased as proof of concept. Tasks include linguistic and discourse analysis, lexical semantic change detection, interview analysis, historical event cause inference and text mining, detection of political stance, text and idea reuse, genre composition in literature and film, social network inference, automated lexicography, missing metadata augmentation, and multimodal visual cultural analytics. In contrast to the focus on English in the emerging LLM applicability literature, many examples here deal with scenarios involving smaller languages and historical texts prone to digitization distortions. In all but the most difficult tasks requiring expert knowledge, generative LLMs can demonstrably serve as viable research instruments. LLM (and human) annotations may contain errors and variation, but the agreement rate can and should be accounted for in subsequent statistical modeling; a bootstrapping approach is discussed. The replications among the case studies illustrate how tasks that previously required potentially months of team effort and complex computational pipelines can now be accomplished by an LLM-assisted scholar in a fraction of the time. Importantly, this approach is not intended to replace, but to augment researcher knowledge and skills. With these opportunities in sight, qualitative expertise and the ability to pose insightful questions have arguably never been more critical.

arXiv.org e-Print Archive
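
The abstract above mentions a bootstrapping approach for carrying annotation agreement into later statistics but does not spell it out. The following is only a minimal sketch of one way such an idea could look, not the author's procedure: the function name `bootstrap_share`, its parameters, and the label-flipping noise model are all assumptions made for illustration. Each replicate resamples the annotated items and, with probability `1 - agreement`, swaps a label for a different class before computing the share of a target class.

```python
import random
from typing import List, Sequence, Tuple


def bootstrap_share(
    labels: Sequence[str],
    classes: Sequence[str],
    target: str,
    agreement: float,
    n_boot: int = 2000,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return a 95% percentile interval for the share of `target` labels.

    Hypothetical noise model (an assumption, not taken from the paper):
    each resampled label is kept with probability `agreement` and
    otherwise replaced by a uniformly chosen different class. This widens
    and shifts the interval to reflect annotation noise; it does not
    correct any bias in the annotations.
    """
    if len(classes) < 2:
        raise ValueError("need at least two classes to model disagreement")
    rng = random.Random(seed)
    n = len(labels)
    estimates: List[float] = []
    for _ in range(n_boot):
        hits = 0
        for _ in range(n):
            label = rng.choice(labels)  # resample items with replacement
            if rng.random() > agreement:  # simulate an annotation error
                label = rng.choice([c for c in classes if c != label])
            hits += label == target
        estimates.append(hits / n)
    estimates.sort()
    lower = estimates[int(0.025 * (n_boot - 1))]
    upper = estimates[int(0.975 * (n_boot - 1))]
    return lower, upper


# Toy data: 100 stance annotations, with an assumed 90% agreement rate.
toy_labels = ["pro"] * 60 + ["con"] * 40
interval = bootstrap_share(toy_labels, ["pro", "con"], "pro", agreement=0.9, n_boot=500)
```

In practice the agreement rate would be estimated by comparing a sample of machine annotations against human ones; a per-class confusion matrix would be a natural refinement of the single `agreement` number assumed here.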

Sentiment Analysis for Words and Fiction Characters From The Perspective of Computational (Neuro-)Poetics

Author: Jacobs Arthur M.
Publication date: 01/01/2019

Two computational studies provide different sentiment analyses for text segments (e.g., ‘fearful’ passages) and figures (e.g., ‘Voldemort’) from the Harry Potter books (Rowling, 1997–2007) based on a novel simple tool called SentiArt. The tool uses vector space models together with theory-guided, empirically validated label lists to compute the valence of each word in a text by locating its position in a 2D emotion potential space spanned by the more than 2 million words of the vector space model. After testing the tool’s accuracy with empirical data from a neurocognitive study, it was applied to compute emotional figure profiles and personality figure profiles (inspired by the so-called ‘big five’ personality theory) for main characters from the book series. The results of comparative analyses using different machine-learning classifiers (e.g., AdaBoost, Neural Net) show that SentiArt performs very well in predicting the emotion potential of text passages. It also produces plausible predictions regarding the emotional and personality profiles of fiction characters, which are correctly identified on the basis of eight character features, and it achieves a good cross-validation accuracy in classifying 100 figures into ‘good’ vs. ‘bad’ ones. The results are discussed with regard to potential applications of SentiArt in digital literary, applied reading and neurocognitive poetics studies, such as the quantification of the hybrid hero potential of figures.

Institutional Repository of the Freie Universität Berlin
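
The abstract above describes computing word valence from a vector space model and label lists. A minimal sketch of that general idea follows, assuming valence is scored as mean cosine similarity to positive label words minus mean cosine similarity to negative ones. This is an illustration, not the SentiArt implementation: the two-dimensional `TOY_VECTORS`, the label lists, and the function names are invented for the example, whereas a real setup would load a pretrained embedding model.

```python
import math
from typing import Dict, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def valence(
    word: str,
    vectors: Dict[str, Vector],
    positive: Sequence[str],
    negative: Sequence[str],
) -> float:
    """Mean similarity to positive labels minus mean similarity to negative labels."""
    v = vectors[word]
    pos = sum(cosine(v, vectors[p]) for p in positive) / len(positive)
    neg = sum(cosine(v, vectors[n]) for n in negative) / len(negative)
    return pos - neg


def mean_valence(
    tokens: Sequence[str],
    vectors: Dict[str, Vector],
    positive: Sequence[str],
    negative: Sequence[str],
) -> float:
    """Average valence over the tokens that have a vector; 0.0 if none do."""
    known = [t for t in tokens if t in vectors]
    if not known:
        return 0.0
    return sum(valence(t, vectors, positive, negative) for t in known) / len(known)


# Hypothetical toy embeddings; real vectors would have hundreds of dimensions.
TOY_VECTORS: Dict[str, Vector] = {
    "joy": (1.0, 0.1),
    "love": (0.9, 0.2),
    "fear": (-1.0, 0.1),
    "hate": (-0.9, 0.3),
    "friend": (0.8, 0.4),
    "curse": (-0.7, 0.5),
}
POSITIVE_LABELS = ["joy", "love"]
NEGATIVE_LABELS = ["fear", "hate"]

friend_score = valence("friend", TOY_VECTORS, POSITIVE_LABELS, NEGATIVE_LABELS)
curse_score = valence("curse", TOY_VECTORS, POSITIVE_LABELS, NEGATIVE_LABELS)
```

With these toy vectors, `friend` scores positive and `curse` negative. Averaging such scores over the words of a passage, or over the words that co-occur with a character's name, is one plausible route to the passage-level and character-level profiles the abstract mentions.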

New directions in corpus-based translation studies

Author: Doms Steven
Fantinuoli Claudio
Fotopoulou Angeliki
Lapshinova-Koltunski Ekaterina
Mouka Effie
Neumann Stella
Niemietz Paula
Pontrandolfo Gianluca
Sanz Zuriñe
Saridakis Ioannis E.
Serbina Tatiana
Uribarri Ibon
Zanettin Federico
Zubillaga Naroa
Publication venue: OAPEN Foundation

OAPEN Library

Textpatterns in a computer assisted translator's workstation

Author: Förster Krischan
Gommlich Klaus
Publication venue: Aarhus University, Faculty of Arts, School of Communication and Culture
Publication date: 27/07/1991

A software package for a computer-assisted translator's workstation should contain a special module consisting of a database of preferred textual structures in the source and target languages (TEXTPAT I) as well as a processor of typical translation cases (TEXTPAT II). TEXTPAT I includes micro- and macrostructures at four levels (text type, text type variants, chunks, syntactic and lexical structures). TEXTPAT II consists of lists of items for which translation rules have to be applied. Both textpats contribute to the idea of a translator's expert system.

Tidsskrift.dk (Det Kongelige Bibliotek)