3,169 research outputs found

    Tradução médica: um mapeamento de teses e dissertações produzidas de 2002 a 2018

    This is a bibliometric mapping of studies on medical translation conducted in Brazil. The primary objective is to map master’s, doctoral, and post-doctoral studies from 2002 to 2018, identifying the topics covered, the institutions, and the geographic regions in which they were carried out. The significance of the study lies in the fact that it indicates the regions and institutions where research on medical translation is conducted and reveals the regions in which this type of study is scarce. In both cases, the mapping can serve as a reference for researchers interested in this area of knowledge. In addition, the mapping can offer input for the implementation of public policies that encourage the development of studies in the area. Data were collected from three digital platforms, namely Domínio Público, Biblioteca Digital de Teses e Dissertações, and Catálogo de Teses e Dissertações da CAPES, by entering different keywords in Portuguese into the available search fields. Of the 14 studies found, 28.6% were doctoral dissertations and 71.4% were master’s theses. All of the studies were conducted in the southeastern and southern regions of Brazil. Cardiology and general biomedicine were the most frequent topics, followed by anesthesiology, orthopedics, cardiovascular surgery, arterial hypertension, physiology, geriatrics, gerontology, nutrition, and pharmacy.
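    As a quick check of the reported split, the percentages above are consistent with 4 doctoral and 10 master’s studies out of the 14 found; these raw counts are inferred from the percentages rather than stated in the abstract.

```python
# Sanity check of the reported breakdown. The counts of 4 and 10 are an
# inference from the stated percentages, not figures given in the abstract.
total = 14
doctoral, masters = 4, 10
print(f"doctoral: {doctoral / total:.1%}, master's: {masters / total:.1%}")
# doctoral: 28.6%, master's: 71.4%
```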

    Natural language processing

    Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems, such as text summarization, information extraction, and information retrieval, including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.

    Labeling Discourse to Build Academic Persona

    Academic research is an increasingly competitive activity, and scientific writers are under constant pressure to get published. Getting past the screening device of the scientific abstract largely depends on the ability to create a discourse that is perceived as coherent, given the target discourse community and the communicative intention. This study focuses on the use of general ‘labeling’ nouns as a factor of coherence and rhetorical persuasion in scientific abstracts, with specific interest in terms determined by an anaphoric ‘this’. Based on a study of PhD abstracts written in English by English and French applicants in several disciplines, my research aims to identify the factors of success and failure in the handling of this device by native and non-native writers. Labeling nouns are identified and semantically classified for each discipline, according to linguistic origin. Case studies show that success requires an adequate lexical choice of labeling noun. It also depends on an appropriate semantic and syntactic connection between the selected labeling noun and the segment it refers to, which requires sufficient general and scientific language proficiency. Didactic applications are then offered in order to raise scientific writers’ awareness of the impact of this type of cohesive device on their credibility.
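    Purely to illustrate the device the study examines, and not its actual methodology, the sketch below pulls candidate “this + noun” labels out of a sample abstract; the example text, the regular expression, and all variable names are hypothetical.

```python
# Illustrative sketch (not the study's method): extract candidate labeling
# nouns determined by an anaphoric "this" (e.g. "this approach", "this finding").
# A POS tagger would be needed to filter out non-noun continuations ("this is").
import re
from collections import Counter

abstract_text = (
    "We measured response times under three conditions. "
    "This approach revealed a stable effect, and this finding suggests "
    "a role for attention."
)

labeling_nouns = Counter(
    match.group(1).lower()
    for match in re.finditer(r"\bthis\s+([a-z]+)\b", abstract_text, flags=re.IGNORECASE)
)
print(labeling_nouns)  # Counter({'approach': 1, 'finding': 1})
```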

    Otrouha: A Corpus of Arabic ETDs and a Framework for Automatic Subject Classification

    Although the Arabic language is spoken by more than 300 million people and is one of the six official languages of the United Nations (UN), there has been less research done on Arabic text data (compared to English) in the realm of machine learning, especially in text classification. In the past decade, Arabic data such as news and tweets have begun to receive some attention. Although automatic text classification plays an important role in improving the browsability and accessibility of data, Electronic Theses and Dissertations (ETDs) have not received their fair share of attention, in spite of the huge number of benefits they provide to students, universities, and future generations of scholars. There are two main roadblocks to performing automatic subject classification on Arabic ETDs. The first is the unavailability of a public corpus of Arabic ETDs. The second is the linguistic complexity of the Arabic language; that complexity is particularly evident in academic documents such as ETDs. To address these roadblocks, this paper presents Otrouha, a framework for automatic subject classification of Arabic ETDs, which has two main goals. The first is building a corpus of Arabic ETDs and their key metadata, such as abstracts, keywords, and titles, to pave the way for more exploratory research on this valuable genre of data. The second is to provide a framework for automatic subject classification of Arabic ETDs through different classification models that use classical machine learning as well as deep learning techniques. The first goal is aided by searching the AskZad Digital Library, which is part of the Saudi Digital Library (SDL). AskZad provides key metadata of Arabic ETDs, such as abstracts, titles, and keywords. The current search results consist of abstracts of Arabic ETDs. This raw data then undergoes a pre-processing phase that includes stop word removal using the Natural Language Tool Kit (NLTK) and word lemmatization using the Farasa API. To date, abstracts of 518 ETDs across 12 subjects have been collected. For the second goal, the preliminary results show that among the machine learning models, binary classification (one-vs.-all) performed better than multiclass classification. The maximum per-subject accuracy is 95%, with an average accuracy of 68% across all subjects. It is noteworthy that the binary classification model performed better for some categories than others: for example, Applied Science and Technology shows 95% accuracy, while Administration shows 36%. Deep learning models resulted in higher accuracy but lower F-measure; their overall performance is lower than that of the machine learning models. This may be due to the small size of the dataset as well as the imbalance in the number of documents per category. Work to collect additional ETDs will be aided by collaborative contributions of data from additional sources.
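    A minimal sketch, not the Otrouha implementation itself, of how the preprocessing and one-vs.-all classification described above could look with NLTK and scikit-learn; the Farasa lemmatization step is only indicated by a comment, and all texts and subject labels are placeholders.

```python
# Illustrative sketch only: NLTK stop-word removal feeding a TF-IDF
# one-vs.-rest classifier, approximating the kind of pipeline the abstract
# describes. Sample texts and labels are placeholders, not Otrouha data.
from nltk.corpus import stopwords  # requires: nltk.download("stopwords")
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline

arabic_stopwords = set(stopwords.words("arabic"))

def remove_stopwords(text: str) -> str:
    """Drop Arabic stop words; lemmatization (e.g. via the Farasa API) would follow here."""
    return " ".join(tok for tok in text.split() if tok not in arabic_stopwords)

# Placeholder abstracts and subject labels (hypothetical).
abstracts = ["...", "...", "..."]
subjects = ["Applied Science and Technology", "Administration", "Education"]

pipeline = make_pipeline(
    TfidfVectorizer(preprocessor=remove_stopwords),
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),  # one binary model per subject
)
# pipeline.fit(abstracts, subjects)  # train with real abstracts and labels
```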

    Collaborative machine translation service for scientific texts

    © 2012 The Authors. Published by ACL. This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher’s website: https://www.aclweb.org/anthology/E12-2003
    French researchers frequently need to translate into French the descriptions of their work published in English. At the same time, the need for French readers to access articles in English, or for international researchers to access theses or papers in French, is poorly met by generic translation tools. We propose the demonstration of an end-to-end tool, integrated into the HAL open archive, that enables efficient translation of scientific texts. The tool gives translation suggestions adapted to the scientific domain, improving the BLEU score of a generic system by more than 10 points. It also provides a post-editing service that captures user post-editing data, which can be used to incrementally improve the translation engines. It is thus helpful for users who need to translate or to access scientific texts.
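    To illustrate the metric behind the reported gain, here is a toy BLEU comparison using NLTK’s corpus_bleu; the sentences and the bigram weighting are illustrative choices and not part of the system described above.

```python
# Illustrative only: a toy BLEU comparison between a "generic" and a
# "domain-adapted" output, to show what a gain of "more than 10 points" means.
# The sentences are placeholders, not outputs of the HAL translation service.
from nltk.translate.bleu_score import corpus_bleu

references = [[["la", "traduction", "automatique", "des", "textes", "scientifiques"]]]
generic_hyp = [["la", "traduction", "des", "textes"]]
adapted_hyp = [["la", "traduction", "automatique", "des", "textes", "scientifiques"]]

# Bigram BLEU (weights up to 2-grams), since these toy sentences are too short
# for the standard 4-gram BLEU to be meaningful.
generic = corpus_bleu(references, generic_hyp, weights=(0.5, 0.5))
adapted = corpus_bleu(references, adapted_hyp, weights=(0.5, 0.5))

# BLEU is conventionally reported on a 0-100 scale; a 10-point improvement
# corresponds to a difference of 0.10 on NLTK's 0-1 scale.
print(f"generic: {generic * 100:.1f}  adapted: {adapted * 100:.1f}")
```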

    Metadiscourse: What is it and where is it going?

    Metadiscourse – the ways in which writers and speakers interact with readers and listeners through their use of language – is a widely used term in current discourse analysis, pragmatics and language teaching. Interest in it has grown over the past 40 years, driven by a dual purpose. The first is a desire to understand the relationship between language and its contexts of use: that is, how individuals use language to orient to and interpret particular communicative situations, and especially how they draw on their understandings of these to make their intended meanings clear to their interlocutors. The second is to employ this knowledge in the service of language and literacy education. But while many researchers and teachers find it to be a conceptually rich and analytically powerful idea, it is not without difficulties of definition, categorisation and analysis. In this paper I explore the strengths and shortcomings of the concept and map its influence and directions through a state-of-the-art analysis of the main online academic databases and current published research.