
    Structure-semantics interplay in complex networks and its effects on the predictability of similarity in texts

    There are different ways to define similarity for grouping similar texts into clusters, as the concept of similarity may depend on the purpose of the task. For instance, in topic extraction similar texts are those within the same semantic field, whereas in author recognition stylistic features should be considered. In this study, we introduce ways to classify texts employing concepts of complex networks, which may be able to capture syntactic, semantic and even pragmatic features. The interplay between the various metrics of the complex networks is analyzed with three applications, namely identification of machine translation (MT) systems, evaluation of the quality of machine-translated texts, and authorship recognition. We show that topological features of the networks representing texts can enhance the ability to identify MT systems in particular cases. For evaluating the quality of MT texts, on the other hand, high correlation was obtained with methods capable of capturing the semantics. This was expected because the gold standards used are themselves based on word co-occurrence. Nevertheless, the Katz similarity, which combines semantics and structure in the comparison of texts, achieved the highest correlation with the NIST measurement, indicating that in some cases the combination of both approaches can improve the ability to quantify quality in MT. In authorship recognition, the topological features were again relevant in some contexts, though for the books and authors analyzed good results were obtained with semantic features as well. Because hybrid approaches encompassing semantic and topological features have not been extensively used, we believe that the methodology proposed here may be useful to enhance text classification considerably, as it combines well-established strategies.

    Types and degrees of interpretive resemblance in translation

    This article explores one of the types of interpretive resemblance found in translation, namely, resemblance between concepts. These are cases where the concept encoded involves a resemblance relation between its literal import and the meaning it communicates, i.e. cases in which words do not literally communicate the concepts they encode. It is argued that translations are often carried out not on the basis of the concept encoded in the original text but on the basis of the actual concept communicated. This constitutes one of the sources of discrepancy found between original and target texts. In these cases, the translation encodes not what was encoded originally but (the translator's interpretation of what) the source concept was intended to communicate. There are three ways in which what is communicated by a concept may depart from what it encodes: concept narrowing, concept loosening, and echoic uses of concepts. In addition to discussing these processes in relation to translation, arguments are put forward for the existence of a further resemblance possibility: concept widening.

    AN ANALYSIS OF ALTERNATIVE NET PRESENT VALUE CAPITAL INVESTMENT DECISION MODELS

    We have found that the disagreement between Returns-to-Assets (RTA) and Returns-to-Equity (RTE) proponents is not confined to agricultural economics. Depending on the course they are taking and the accompanying text, students are likely to learn that there is a "right" way to calculate Net Present Values (NPVs), either by the RTA method or the RTE method. In most cases, only one of the two methods is discussed and illustrated with numerical examples. Less common are texts that compare the two methods, discuss their underlying assumptions, or show how the NPVs from the two methods can be reconciled. The paper is organized as follows. The first section of the main body of the paper provides a comparative overview of the RTA and RTE methods; the second section discusses our textbook survey; the final section offers our conclusions. Appendix A contains a brief history of the theoretical development of discounted cash flow (DCF) concepts. Appendix B contains additional details on defining components of NPV models. Finally, Appendix C is a listing of some additional references.
    Research Methods/Statistical Methods
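
    For orientation, the discounting step that any NPV model relies on can be sketched as below. This is only the generic discounted-cash-flow formula, NPV = sum of CF_t / (1 + r)^t; the abstract does not spell out how the RTA and RTE variants define their cash flows or discount rates, so the function name `npv`, the 8% rate, and the project figures are illustrative assumptions rather than the authors' models.

```python
from typing import Sequence


def npv(rate: float, cash_flows: Sequence[float]) -> float:
    """Net present value of cash_flows[t] received at the end of period t.

    cash_flows[0] is the amount at time zero (usually a negative outlay)
    and is not discounted. `rate` is a constant per-period discount rate.
    """
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


# Hypothetical project (made-up numbers): an outlay of 1,000 today,
# followed by inflows of 400 at the end of each of three years.
project = [-1000.0, 400.0, 400.0, 400.0]
print(f"NPV at 8%: {npv(0.08, project):.2f}")
```

    A positive result means the discounted inflows exceed the initial outlay at the chosen rate; which cash flows and which rate to use is exactly where the two methods discussed in the abstract differ.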

    Analyzing transfer learning impact in biomedical cross lingual named entity recognition and normalization

    Background: The volume of biomedical literature and clinical data is growing at an exponential rate. Therefore, efficient access to data described in unstructured biomedical texts is a crucial task for the biomedical industry and research. Named Entity Recognition (NER) is the first step for information and knowledge acquisition when we deal with unstructured texts. Recent NER approaches use contextualized word representations as input for a downstream classification task. However, distributed word vectors (embeddings) are very limited in Spanish, and even more so for the biomedical domain. Methods: In this work, we develop several biomedical Spanish word representations, and we introduce two Deep Learning approaches for the recognition of pharmaceutical, chemical, and other biomedical entities in Spanish clinical case texts and biomedical texts, one based on a Bi-LSTM-CRF model and the other on a BERT-based architecture. Results: Several Spanish biomedical embeddings together with the two deep learning models were evaluated on the PharmaCoNER and CORD-19 datasets. The PharmaCoNER dataset is composed of a set of Spanish clinical cases annotated with drugs, chemical compounds and pharmacological substances; our extended Bi-LSTM-CRF model obtains an F-score of 85.24% on entity identification and classification and the BERT model obtains an F-score of 88.80%. For the entity normalization task, the extended Bi-LSTM-CRF model achieves an F-score of 72.85% and the BERT model achieves 79.97%. The CORD-19 dataset consists of scholarly articles written in English annotated with biomedical concepts such as disorder, species, chemical or drugs, gene and protein, enzyme and anatomy. The Bi-LSTM-CRF model and the BERT model obtain an F-measure of 78.23% and 78.86%, respectively, on entity identification and classification on the CORD-19 dataset. Conclusion: These results show that deep learning models with in-domain knowledge learned from large-scale datasets substantially improve named entity recognition performance. Moreover, contextualized representations help to capture the complexities and ambiguity inherent to biomedical texts. Embeddings based on words, concepts, senses, etc. for languages other than English are required to improve NER tasks in those languages. This work was partially supported by the Research Program of the Ministry of Economy and Competitiveness - Government of Spain (DeepEMR project TIN2017-87548-C2-1-R).
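
    The F-scores quoted above are presumably the usual harmonic mean of precision and recall. The sketch below shows that standard definition only; the abstract does not state the matching criterion (exact or partial spans) or the averaging scheme, so the helper names and the entity counts are hypothetical and do not reproduce the reported figures.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta = 1 gives the harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was matched
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


def prf_from_counts(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Precision, recall and F1 from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f_score(precision, recall)


# Hypothetical counts for one entity type (not taken from the paper).
p, r, f1 = prf_from_counts(tp=80, fp=20, fn=20)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

    Because the harmonic mean is dominated by the smaller of the two values, a system cannot reach a high F-score by trading recall away for precision or the reverse.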

    Crossing boundaries: the translation and cultural adaptation of folk narratives


    Summer Season in the Working Settlement: From Mushrooms to People and Back. Uralmash in Fiction and Non-fiction Literature About 1930s

    The article analyzes several key concepts in texts about the construction and the first years of operation of Uralmashzavod (1930s, 1960s-1970s). They are traced in fiction and diary prose, memoirs and records, in which similar motifs are found. Most of the texts are stored in the Museum archive; they are either introduced into scholarly discourse for the first time or become the object of philological and anthropological study for the first time. “Excluded” loci of Ovalov’s novel are “filled” with memories about Uralmash, but the main memory of “first builders” involves a surprisingly large number of concepts presented in most cases only in fiction. The article attempts to trace how the chronotope and topography of the working settlement in fiction are reflected in the identity of “old-timers”, corresponding to the deep needs and expectations of the factory workers. The reconstruction of the “anthropogenesis” of a “new worker”, emerging from the meta-text about the giant plant, is presented alongside the paradox of the adoption of “wild space”, which, after being cleared of forest for the construction of the plant, over the decades turned into one of the greenest areas of Sverdlovsk-Ekaterinburg.
    Keywords: Uralmash, socialist city, first builders, Prishvin’s diaries, A. Bannikov, E. Bannikova, L. Ovalov, V. Anfimov, “Zina Demina”, Jiang Tsinggu