    "Our interaction was very productive": levels of reflection in learners’ diaries in teletandem

    This investigation analyzes what Brazilian participants say about teletandem interaction in their diaries and how (or whether) they reflect upon it. Data are elicited from 350 diaries written in English and stored in MulTeC (Multimodal Teletandem Corpus) (Aranha & Lopes, 2019), from which 333 fragments of text containing the most frequent word, "interaction" (Leone et al., ongoing), were compiled. The analytical framework is based on Moon's (2004) model of reflective writing and Garcia et al.'s (2017) proposal of metacognitive operations. Results reveal that 41% of the fragments featured mere descriptions of the events that occurred during the interaction, while most (59%) presented some degree of reflection. In our data, reflection seems to be based on: (i) the assessment of the interaction, in line with Garcia et al.'s (2017) proposal that assessment is a frequent metacognitive operation; and (ii) the recognition of different elements that are relevant for language learning in teletandem, i.e., the partner's collaboration and task specificities. Our findings also indicate that the other metacognitive operations (setting goals, planning, selecting resources, managing emotions) support or motivate the judgments made by learners. This corroborates the notion proposed in other research (Moon, 2004; 2010) that writing diaries may promote reflection, which is a fundamental aspect of autonomous learning.
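
    The 333 "interaction" fragments are essentially keyword-in-context (KWIC) concordance lines drawn from the diary corpus. As a rough illustration only (MulTeC's actual file layout and tooling are not described here; the directory name and window size below are assumptions), such fragments could be extracted as follows:

```python
import re
from pathlib import Path

def kwic(text: str, keyword: str, window: int = 8):
    """Yield keyword-in-context fragments: `window` tokens per side."""
    tokens = re.findall(r"\w+|[^\w\s]", text.lower())
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            yield " ".join(left + [tok.upper()] + right)

# Assumed layout: one plain-text diary per file in a local folder.
fragments = []
for diary in Path("multec_diaries").glob("*.txt"):
    fragments.extend(kwic(diary.read_text(encoding="utf-8"), "interaction"))
print(len(fragments), "fragments containing 'interaction'")
```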

    Using attention methods to predict judicial outcomes

    Legal Judgment Prediction is one of the most acclaimed fields at the intersection of NLP, AI, and Law. By legal prediction we mean intelligent systems capable of predicting specific judicial characteristics, such as the outcome or the class of a specific case. In this research, we used AI classifiers to predict judicial outcomes in the Brazilian legal system. For this purpose, we developed a text crawler to extract data from the official Brazilian electronic legal systems. These texts formed a dataset of second-degree murder and active corruption cases. We applied different classifiers, such as Support Vector Machines and Neural Networks, to predict judicial outcomes by analyzing textual features from the dataset. Our research showed that Regression Trees, Gated Recurrent Units, and Hierarchical Attention Networks achieved the highest metrics on different subsets. As a final goal, we explored the weights of one of the algorithms, the Hierarchical Attention Networks, to find a sample of the words most strongly associated with absolving or convicting defendants.
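
    The inspection of attention weights described above rests on the word-level attention mechanism of Hierarchical Attention Networks (Yang et al., 2016): each token receives a weight, and high-weight tokens can be read off as "important". The sketch below shows that mechanism in generic PyTorch; it is not the authors' code, and the toy tokens and hidden size are assumptions:

```python
import torch
import torch.nn as nn

class WordAttention(nn.Module):
    """Word-level attention as in Hierarchical Attention Networks:
    score each token, return the attention-weighted sentence vector."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, hidden_dim)
        self.context = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, h):                              # h: (batch, seq, hidden)
        u = torch.tanh(self.proj(h))                   # (batch, seq, hidden)
        alpha = torch.softmax(self.context(u), dim=1)  # (batch, seq, 1)
        return (alpha * h).sum(dim=1), alpha.squeeze(-1)

# Toy usage: rank the tokens of one sentence by attention weight.
tokens = ["the", "defendant", "was", "acquitted"]
h = torch.randn(1, len(tokens), 64)  # stand-in for encoder (e.g. GRU) outputs
_, weights = WordAttention(64)(h)
print(sorted(zip(tokens, weights[0].tolist()), key=lambda t: -t[1]))
```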

    Translating cyberculture: an analysis of American and Brazilian cultural differences evidenced in the translation of a popular computer text

    This thesis examines a popular American computer book and its translation into Brazilian Portuguese to determine whether current discourses on computers and technology are being literally translated or culturally adapted for their target audience. The selected text adopts a humorous approach to learning new software applications and replaces complicated technical explanations with culturally bound examples that are inextricably tied to American attitudes toward technology. An analysis of the translation reveals that the ideologies and social codes at work in the book threaten to impede the Brazilian reader's understanding, due to the translator's failure to adapt the text for the target audience.

    Methods for improving entity linking and exploiting social media messages across crises

    Entity Linking (EL) is the task of automatically identifying entity mentions in texts and resolving them to the corresponding entities in a reference knowledge base (KB). A large number of tools are available for different types of documents and domains; however, the entity linking literature has shown that a tool's quality varies across corpora and depends on specific characteristics of the corpus it is applied to. Moreover, the lack of precision on particularly ambiguous mentions often spoils the usefulness of automated disambiguation results in real-world applications.

    In the first part of this thesis, I explore an approximation of the difficulty of linking entity mentions and frame it as a supervised classification task. Classifying difficult-to-disambiguate entity mentions can help identify critical cases in a semi-automated system, while detecting latent corpus characteristics that affect entity linking performance. Moreover, despite the large number of entity linking tools proposed over the past years, some tools work better on short mentions while others perform better when there is more contextual information. To this end, I proposed a solution that exploits the results of distinct entity linking tools on the same corpus by leveraging their individual strengths on a per-mention basis (a sketch of this idea follows the abstract). The proposed solution proved effective and outperformed the individual entity linking systems in a series of experiments. An important component of most entity linking tools is the probability that a mention links to a given entity in a reference knowledge base, and this probability is usually computed over a static snapshot of the KB. However, an entity's popularity is temporally sensitive and may change due to short-term events, and these changes may then be reflected in the KB, so EL tools can produce different results for a given mention at different times. I investigated how this prior probability changes over time, and the resulting overall disambiguation performance, using KBs from different time periods.

    The second part of this thesis is mainly concerned with short texts. Social media has become an integral part of modern society. Twitter, for instance, is one of the most popular social media platforms in the world, enabling people to share their opinions and post short messages about any subject on a daily basis. I first presented an approach to identifying informative messages during catastrophic events using deep learning techniques. Automatically detecting informative messages posted by users during major events can help professionals involved in crisis management better estimate damages from the relevant information posted on social media channels, and act immediately. I also performed an analysis of Twitter messages posted during the Covid-19 pandemic: I collected 4 million tweets posted in Portuguese since the beginning of the pandemic and analyzed the debate around it, using topic modeling, sentiment analysis, and hashtag recommendation techniques to provide insights into the online discussion of the Covid-19 pandemic.
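
    The per-mention combination of EL tools described above amounts to training a meta-classifier that, for each mention, chooses which tool's link to trust. A minimal sketch of that idea follows; the two tools, the feature set (mention length, tool confidences, agreement), and the toy data are all assumptions, not the thesis's actual setup:

```python
from sklearn.ensemble import RandomForestClassifier

# Toy training data for two hypothetical EL tools, A and B. Features per
# mention: [token_count, confidence_A, confidence_B, tools_agree (0/1)];
# the label says which tool linked that mention correctly.
X = [[1, 0.92, 0.30, 0], [6, 0.40, 0.85, 0], [2, 0.70, 0.72, 1],
     [1, 0.88, 0.20, 0], [5, 0.35, 0.90, 0], [3, 0.60, 0.65, 1]]
y = ["A", "B", "B", "A", "B", "B"]
selector = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

def link(mention_feats, link_a, link_b):
    """Return the KB entity of whichever tool the selector prefers."""
    return link_a if selector.predict([mention_feats])[0] == "A" else link_b

print(link([1, 0.95, 0.10, 0], "dbpedia:Paris", "dbpedia:Paris,_Texas"))
```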

    Investigating norms in the Brazilian official translation of semiotic items, culture-bound items, and translator's paratextual interventions

    Doctoral thesis - Universidade Federal de Santa Catarina, Centro de Comunicação e Expressão, Programa de Pós-Graduação em Letras/Inglês e Literatura Correspondente. A descriptive approach is used in this study to investigate the norms responsible for the constraints limiting the translator's choices when dealing with three specific aspects of official translation in Brazil: the translation of semiotic items, the translation of culture-bound items, and the insertion of paratextual interventions. Translations of the following documents were analyzed: academic transcripts, birth or marriage certificates, driver's licenses, police record certificates, and diplomas. Using these textual sources, as well as extratextual sources, this study sought to answer the following questions: What strategies are most frequently employed by the 42 official translators participating in this study when translating coats of arms, stamps, and signatures? How are school names, units of measurement, and some specific phraseologisms commonly found in official documents translated? What kinds of comments and notes do official translators usually add to their translated texts? The strategies used were analyzed, and possible reasons for the translators' behavior were suggested. In addition, categorizations were proposed for the strategies employed in the translation of semiotic items and for the types of translator's interventions appearing in official translations done in Brazil with the Portuguese-English language pair.

    Benchmarks for Pirá 2.0, a Reading Comprehension Dataset about the Ocean, the Brazilian Coast, and Climate Change

    Pirá is a reading comprehension dataset focused on the ocean, the Brazilian coast, and climate change, built from a collection of scientific abstracts and reports on these topics. This dataset is a versatile language resource, particularly useful for testing the ability of current machine learning models to acquire expert scientific knowledge. Despite its potential, a detailed set of baselines has not yet been developed for Pirá. By creating these baselines, researchers can more easily use Pirá as a resource for testing machine learning models across a wide range of question answering tasks. In this paper, we define six benchmarks over the Pirá dataset, covering closed generative question answering, machine reading comprehension, information retrieval, open question answering, answer triggering, and multiple choice question answering. As part of this effort, we have also produced a curated version of the original dataset, in which we fixed a number of grammar issues, repetitions, and other shortcomings. Furthermore, the dataset has been extended in several new directions to support the aforementioned benchmarks: translation of supporting texts from English into Portuguese, classification labels for answerability, automatic paraphrases of questions and answers, and multiple choice candidates. The results described in this paper provide several points of reference for researchers interested in exploring the challenges posed by the Pirá dataset.
    Comment: Accepted at Data Intelligence. Online ISSN 2641-435
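
    For the machine reading comprehension benchmark, a standard way to score predicted answers against gold answers is SQuAD-style token-overlap F1. The sketch below shows that metric; the record layout is a guess for illustration, not Pirá's actual schema or evaluation script:

```python
from collections import Counter

def token_f1(prediction: str, gold: str) -> float:
    """SQuAD-style token-overlap F1 between two answer strings."""
    pred, ref = prediction.lower().split(), gold.lower().split()
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

# Hypothetical record layout for a Pirá-style QA pair.
example = {"question": "What drives sea-level rise on the Brazilian coast?",
           "gold_answer": "thermal expansion and melting ice sheets"}
print(token_f1("melting ice sheets and thermal expansion",
               example["gold_answer"]))  # 1.0: same tokens, different order
```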