58 research outputs found
On the Mono- and Cross-Language Detection of Text Re-Use and Plagiarism
Barrón Cedeño, LA. (2012). On the Mono- and Cross-Language Detection of Text Re-Use and Plagiarism [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/16012
Information-theoretic causal inference of lexical flow
This volume seeks to infer large phylogenetic networks from phonetically encoded lexical data and contribute in this way to the historical study of language varieties. The technical step that enables progress in this case is the use of causal inference algorithms. Sample sets of words from language varieties are preprocessed into automatically inferred cognate sets, and then modeled as information-theoretic variables based on an intuitive measure of cognate overlap. Causal inference is then applied to these variables in order to determine the existence and direction of influence among the varieties. The directed arcs in the resulting graph structures can be interpreted as reflecting the existence and directionality of lexical flow, a unified model which subsumes inheritance and borrowing as the two main ways of transmission that shape the basic lexicon of languages. A flow-based separation criterion and domain-specific directionality detection criteria are developed to make existing causal inference algorithms more robust against imperfect cognacy data, giving rise to two new algorithms. The Phylogenetic Lexical Flow Inference (PLFI) algorithm requires lexical features of proto-languages to be reconstructed in advance, but yields fully general phylogenetic networks, whereas the more complex Contact Lexical Flow Inference (CLFI) algorithm treats proto-languages as hidden common causes, and only returns hypotheses of historical contact situations between attested languages. The algorithms are evaluated both against a large lexical database of Northern Eurasia spanning many language families, and against simulated data generated by a new model of language contact that builds on the opening and closing of directional contact channels as primary evolutionary events. The algorithms are found to infer the existence of contacts very reliably, whereas the inference of directionality remains difficult. 
This currently limits the new algorithms to a role as exploratory tools for quickly detecting salient patterns in large lexical datasets, but it should soon be possible for the framework to be enhanced, e.g. by confidence values for each directionality decision.
Interpreting
What do community interpreting for the Deaf in western societies, conference interpreting for the European Parliament, and language brokering in international management have in common? Academic research and professional training have historically emphasized the linguistic and cognitive challenges of interpreting, neglecting or ignoring the social aspects that structure communication. All forms of interpreting are inherently social; they involve relationships among at least three people and two languages. The contexts explored here, American Sign Language/English interpreting and spoken language interpreting within the European Parliament, show that simultaneous interpreting involves attitudes, norms and values about intercultural communication that overemphasize information and discount cultural identity.
The default mode of interpreting shows a desire for speed that suppresses differences requiring cultural mediation. It is theorized that this imbalance stems from the invention and implementation of simultaneous interpreting within a highly charged historical moment that was steeped in trauma. Interpreting as a professional practice developed in keeping with the technological capacities and historical contingencies accompanying processes of industrialization and modernity. The resulting expectations about what interpreting can and cannot achieve play out in microsocial group dynamics (as inequality) and macrosocial policy (legalized injustice).
Interpreting invites an encounter with difference: foreignization is embedded within the experience of participating in simultaneous interpretation because interpreting disrupts the accustomed flow of consciousness, forcing participants to adapt (or resist adapting) to an alternate rhythm of turn-taking. This results in an unusual awareness of time. Discomforts associated with heightened time-consciousness open possibilities for deep learning and new kinds of relationships among people, ideas, and problem-setting.
An analysis of the frustrations of users (interpretees) and practitioners (interpreters) suggests the need for remedies other than complete domestication. Reframing training for interpreters, and cultivating skillful and strategic participation by interpretees, could be leveraged systematically to improve social equality and reduce intercultural tensions through a balanced emphasis on sharing understanding and creating mutually-relevant meanings. This comparative cultural and critical discourse analysis enables an action research/action learning hypothesis aimed at intercultural social resilience: social control of diversity can be calibrated and contained through rituals of participation in special practices of simultaneously-interpreted communication.
The construction of Erasmus student identity: a discourse historic approach
This thesis examines the construction of a student mobility programme and mobile students’ identities in discourses of Erasmus exchange students (bottom-up discourses) and political speeches and institutional texts (top-down discourses). By adopting a post-modern perspective on identity and its construction in discourse, this study intends to fill a gap in the field of student mobility research, which has been predominantly concerned with the North American rather than the European context, and even less with the Latvian one, and which has been mainly quantitative in nature, looking at large-scale statistical data while overlooking the complexities and variation among individual experiences. The study applies the Discourse Historical Approach (DHA) to three sets of data: individual interviews with incoming Erasmus exchange students in Latvia, political speeches by the former EU Minister of Education, A. Vassiliou, and online texts published on the web page of the Latvian State Education Agency. The results indicate that mobile European exchange students’ identities are constructed differently in institutional as opposed to experiential contexts. It seems that, on the one hand, Latvian institutional texts focus on building a positive representation of Latvia, characterised by openness and its affiliations with Europe and the world as the outcome of the Erasmus programme; the EU political discourse promotes the triumph of Erasmus as a European project, pointing to the vitality of the student mobility programme, which leads to an increase in the number of people with a European identity, as the actual proof of the programme’s success. On the other hand, in contrast to the institutional online texts and the Commissioner’s speeches, the Erasmus students indicate their awareness of the complex, multiple and changing nature of mobile students’ identities and their construction in discourse when faced with new contexts and diverse individuals.
CLARIN. The infrastructure for language resources
CLARIN, the "Common Language Resources and Technology Infrastructure", has established itself as a major player in the field of research infrastructures for the humanities. This volume provides a comprehensive overview of the organization, its members, its goals and its functioning, as well as of the tools and resources hosted by the infrastructure. The many contributors representing various fields, from computer science to law to psychology, analyse a wide range of topics, such as the technology behind the CLARIN infrastructure, the use of CLARIN resources in diverse research projects, the achievements of selected national CLARIN consortia, and the challenges that CLARIN has faced and will face in the future.
The book will be published in 2022, 10 years after the establishment of CLARIN as a European Research Infrastructure Consortium by the European Commission (Decision 2012/136/EU).
CLARIN
The book provides a comprehensive overview of the Common Language Resources and Technology Infrastructure (CLARIN) for the humanities. It covers a broad range of CLARIN language resources and services, its underlying technological infrastructure, the achievements of national consortia, and challenges that CLARIN will tackle in the future. The book is published 10 years after the establishment of CLARIN as a European Research Infrastructure Consortium.