127 research outputs found

    Using nurses’ natural language entries to build a concept-oriented terminology for patients’ chief complaints in the emergency department

    Information about the chief complaint (CC), also known as the patient's reason for seeking emergency care, is critical for patient prioritization for treatment and determination of patient flow through the emergency department (ED). Triage nurses document the CC at the start of the ED visit, and the data are increasingly available in electronic form. Despite the clinical and operational significance of the CC to the ED, there is no standard CC terminology. We propose the construction of concept-oriented nursing terminologies from the actual language used by experts. We use text analysis to extract CC concepts from triage nurses' natural language entries. Our methodology for building the nursing terminology utilizes natural language processing techniques and the Unified Medical Language System.

    ESP corpus construction: a plea for a needs-driven approach

    Tools for Terminology Processing

    Automatic terminology processing appeared 10 years ago when electronic corpora became widely available. Such processing may be statistically or linguistically based and produces terminology resources that can be used in a number of applications: indexing, information retrieval, technology watch, etc. We present the tools that have been developed in the IRIN Institute. They all take texts (or collections of texts) as input and reflect different stages of terminology processing: term acquisition, term recognition and term structuring.

    Towards the creation of a CNL adapted to requirements writing by combining writing recommendations and spontaneous regularities : example in a Space Project

    The Quality Department of the French National Space Agency (CNES, Centre National d’Études Spatiales) wishes to design a writing guide based on the real and regular writing of requirements. As a first step in this project, the present article proposes a linguistic analysis of requirements written in French by CNES engineers. One of our goals is to determine to what extent they conform to several rules laid down in two existing Controlled Natural Languages (CNLs), namely the Simplified Technical English developed by the AeroSpace and Defense Industries Association of Europe and the Guide for Writing Requirements proposed by the International Council on Systems Engineering. Indeed, although CNES engineers are not obliged to follow any controlled language in their writing of requirements, we believe that language regularities are likely to emerge from this task, mainly due to the writers’ experience. We are seeking to identify these regularities in order to use them as a basis for a new CNL for the writing of requirements. The issue is approached using natural language processing tools to identify sentences that do not comply with the rules or contain specific linguistic phenomena. We further review these sentences to understand why the recommendations cannot (or should not) always be applied when specifying large-scale projects.

    Genres, registers, text types, domain, and styles: Clarifying the concepts and navigating a path through the BNC jungle

    Understandings of language and cognition

    Evaluation des outils terminologiques : enjeux, difficultés et propositions

    A special case among natural language processing tasks, term acquisition has so far hardly been the subject of systematic evaluation. The evaluation campaigns that have taken place are recent and limited. It is nevertheless necessary to conduct evaluations in order to take stock of past research and to measure both the progress made and the blind spots. This article argues that comparative evaluation protocols can be defined even for complex tasks such as computational terminology. The proposed method relies on decomposing terminology analysis tools into elementary functions and on defining precision and recall measures adapted to terminological problems, namely the complexity of terminological products, their dependence on applications, the role of user interaction, and the variability of reference terminologies.
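As a point of reference for the precision and recall measures mentioned in this abstract, the sketch below computes the standard, unadapted versions for a list of extracted terms against a reference terminology. It is only an illustration: the function name `precision_recall`, the exact-match comparison after lowercasing, and the sample terms are assumptions, and the adapted measures proposed in the article are not reproduced here.

```python
def precision_recall(extracted, reference):
    """Return (precision, recall) of extracted terms against a reference list.

    Assumption: two terms match only if they are identical after trimming
    whitespace and lowercasing. Real evaluations often allow term variants.
    """
    extracted_set = {term.strip().lower() for term in extracted}
    reference_set = {term.strip().lower() for term in reference}

    true_positives = len(extracted_set & reference_set)

    # Guard against empty inputs to avoid division by zero.
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    recall = true_positives / len(reference_set) if reference_set else 0.0
    return precision, recall


# Invented sample data, for illustration only.
candidate_terms = ["term extraction", "corpus", "the"]
reference_terms = ["term extraction", "corpus", "term structuring", "indexing"]
precision, recall = precision_recall(candidate_terms, reference_terms)
```

With this sample, two of the three candidates appear in the four-term reference list, so precision is 2/3 and recall is 1/2.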

    Complexity in Translation. An English-Norwegian Study of Two Text Types

    The present study discusses two primary research questions. Firstly, we have tried to investigate to what extent it is possible to compute the actual translation relation found in a selection of English-Norwegian parallel texts. By this we understand the generation of translations with no human intervention, and we assume an approach to machine translation (MT) based on linguistic knowledge. In order to answer this question, a measurement of translational complexity is applied to the parallel texts. Secondly, we have tried to find out if there is a difference in the degree of translational complexity between the two text types, law and fiction, included in the empirical material. The study is a strictly product-oriented approach to complexity in translation: it disregards aspects related to translation methods, and to the cognitive processes behind translation. What we have analysed are intersubjectively available relations between source texts and existing translations. The degree of translational complexity in a given translation task is determined by the types and amounts of information needed to solve it, as well as by the accessibility of these information sources, and the effort required when they are processed. For the purpose of measuring the complexity of the relation between a source text unit and its target correspondent, we apply a set of four correspondence types, organised in a hierarchy reflecting divisions between different linguistic levels, along with a gradual increase in the degree of translational complexity. In type 1, the least complex type, the corresponding strings are pragmatically, semantically, and syntactically equivalent, down to the level of the sequence of word forms. In type 2, source and target string are pragmatically and semantically equivalent, and equivalent with respect to syntactic functions, but there is at least one mismatch in the sequence of constituents or in the use of grammatical form words. 
Within type 3, source and target string are pragmatically and semantically equivalent, but there is at least one structural difference violating syntactic functional equivalence between the strings. In type 4, there is at least one linguistically non-predictable, semantic discrepancy between source and target string. The correspondence type hierarchy, ranging from 1 to 4, is characterised by an increase with respect to linguistic divergence between source and target string, an increase in the need for information and in the amount of effort required to translate, and a decrease in the extent to which there exist implications between relations of source-target equivalence at different linguistic levels. We assume that there is a translational relation between the inventories of simple and complex linguistic signs in two languages which is predictable, and hence computable, from information about source and target language systems, and about how the systems correspond. Thus, computable translations are predictable from the linguistic information coded in the source text, together with given, general information about the two languages and their interrelations. Further, we regard non-computable translations to be correspondences where it is not possible to predict the target expression from the information encoded in the source expression, together with given, general information about SL and TL and their interrelations. Non-computable translations require access to additional information sources, such as various kinds of general or task-specific extra-linguistic information, or task-specific linguistic information from the context surrounding the source expression. In our approach, correspondences of types 1–3 constitute the domain of linguistically predictable, or computable, translations, whereas type 4 correspondences belong to the non-predictable, or non-computable, domain, where semantic equivalence is not fulfilled. 
The empirical method involves extracting translationally corresponding strings from parallel texts, and assigning one of the types defined by the correspondence hierarchy to each recorded string pair. The analysis is applied to running text, omitting no parts of it. Thus, the distribution of the four types of translational correspondence within a set of data provides a measurement of the degree of translational complexity in the parallel texts that the data are extracted from. The complexity measurements of this study are meant to show to what extent we assume that an ideal, rule-based MT system could simulate the given translations, and for this reason the finite clause is chosen as the primary unit of analysis. The work of extracting and classifying translational correspondences is done manually as it requires a bilingually competent human analyst. In the present study, the recorded data cover about 68,000 words. They are compiled from six different text pairs: two of them are law texts, and the remaining four are fiction texts. Comparable amounts of text are included for each text type, and both directions of translation are covered. Since the scope of the investigation is limited, we cannot, on the basis of our analysis, generalise about the degree of translational complexity in the chosen text types and in the language pair English-Norwegian. Calculated in terms of string lengths, the complexity measurement across the entire collection of data shows that as little as 44.8% of all recorded string pairs are classified as computable translational correspondences, i.e. as type 1, 2, or 3, and non-computable string pairs of type 4 constitute a majority (55.2%) of the compiled data. On average, the proportion of computable correspondences is 50.2% in the law data, and 39.6% in fiction. 
In relation to the question whether it would be fruitful to apply automatic translation to the selected texts, we have considered the workload potentially involved in correcting machine output, and in this respect the difference in restrictedness between the two text types is relevant. Within the non-computable correspondences, the frequency of cases exhibiting only one minimal semantic deviation between source and target string is considerably higher among the data extracted from the law texts than among those recorded from fiction. For this reason we tentatively regard the investigated pairs of law texts as representing a text type where tools for automatic translation may be helpful, if the effort required by post-editing is smaller than that of manual translation. This is possibly the case in one of the law text pairs, where 60.9% of the data involve computable translation tasks. In the other pair of law texts the corresponding figure is merely 38.8%, and the potential helpfulness of automatisation would be even more strongly determined by the edit cost. That text might be a task for computer-aided translation, rather than for MT. As regards the investigated fiction texts, it is our view that post-editing of automatically generated translations would be laborious and not cost effective, even in the case of one text pair showing a relatively low degree of translational complexity. Hence, we concur with the common view that the translation of fiction is not a task for MT.
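The complexity measurement described in this abstract, the share of string pairs of each correspondence type calculated in terms of string lengths, can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the function names `type_shares` and `computable_share` are hypothetical, each pair is represented as a `(type, length)` tuple, the unit of length (words or characters) is left open because the abstract does not specify it, and the sample data are invented rather than taken from the study.

```python
from collections import defaultdict

# Types 1-3 are the computable correspondences according to the abstract;
# type 4 is the non-computable one.
COMPUTABLE_TYPES = {1, 2, 3}
VALID_TYPES = {1, 2, 3, 4}


def type_shares(pairs):
    """Return {correspondence_type: share of total string length}.

    `pairs` is a list of (correspondence_type, string_length) tuples.
    """
    totals = defaultdict(int)
    for ctype, length in pairs:
        if ctype not in VALID_TYPES:
            raise ValueError(f"unknown correspondence type: {ctype}")
        totals[ctype] += length

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {ctype: totals[ctype] / grand_total for ctype in sorted(totals)}


def computable_share(pairs):
    """Return the length-weighted share of pairs classified as type 1, 2 or 3."""
    shares = type_shares(pairs)
    return sum(share for ctype, share in shares.items() if ctype in COMPUTABLE_TYPES)


# Invented sample data: (correspondence type, string length).
sample_pairs = [(1, 10), (2, 20), (3, 15), (4, 55)]
share = computable_share(sample_pairs)
```

For the invented sample, types 1 to 3 account for 45 of 100 length units, so the computable share is about 0.45. Weighting by length rather than counting pairs means that one long type 4 clause outweighs several short type 1 clauses.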

    Survival analysis of author keywords: An application to the library and information sciences area

    "This is the peer reviewed version of the following article: Peset, F, F Garzón-Farinós, LM González, X García-Massó, A Ferrer-Sapena, JL Toca-Herrera, and EA Sánchez-Pérez. 2019. "Survival Analysis of Author Keywords: An Application to the Library and Information Sciences Area." Journal of the Association for Information Science and Technology 71 (4). Wiley: 462-73. doi:10.1002/asi.24248, which has been published in final form at https://doi.org/10.1002/asi.24248. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving." [EN] Our purpose is to adapt a statistical method for the analysis of discrete numerical series to the keywords appearing in scientific articles of a given area. As an example, we apply our methodological approach to the study of the keywords in the Library and Information Sciences (LIS) area. Our objective is to detect the new author keywords that appear in a fixed knowledge area in the period of 1 year in order to quantify the probabilities of survival for 10 years as a function of the impact of the journals where they appeared. Many of the new keywords appearing in the LIS field are ephemeral. Actually, more than half are never used again. In general, the terms most commonly used in the LIS area come from other areas. The average survival time of these keywords is approximately 3 years, being slightly higher in the case of words that were published in journals classified in the second quartile of the area. We believe that measuring the appearance and disappearance of terms will allow understanding some relevant aspects of the evolution of a discipline, providing in this way a new bibliometric approach. Peset Mancebo, MF.; Garzón Farinós, MF.; Gonzalez, L.; García-Massó, X.; Ferrer Sapena, A.; Toca-Herrera, JL.; Sánchez Pérez, EA. (2020). Survival analysis of author keywords: An application to the library and information sciences area. 
Journal of the Association for Information Science and Technology (Online). 71(4):462-473. https://doi.org/10.1002/asi.24248
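The idea of quantifying keyword survival probabilities over a number of years can be illustrated with a generic discrete-time product-limit (Kaplan-Meier style) estimate. This is a sketch of a standard technique, not necessarily the estimator used in the article: the function name `survival_curve`, the `(years_survived, disappeared)` encoding, and the sample observations are all assumptions made for illustration.

```python
def survival_curve(observations, horizon):
    """Estimate the survival probability at the end of each year 1..horizon.

    `observations` is a list of (years_survived, disappeared) tuples:
      - years_survived: the last year in which the keyword was still observed
      - disappeared: True if the keyword stopped being used in that year,
        False if observation simply ended (a censored keyword)
    Returns a list whose element at index t-1 is the estimated probability
    that a keyword survives past year t.
    """
    curve = []
    survival = 1.0
    for year in range(1, horizon + 1):
        # Keywords still under observation at the start of this year.
        at_risk = sum(1 for duration, _ in observations if duration >= year)
        # Keywords that disappeared during this year.
        events = sum(
            1 for duration, gone in observations if duration == year and gone
        )
        if at_risk > 0:
            survival *= 1 - events / at_risk
        curve.append(survival)
    return curve


# Invented sample data: six keywords followed for up to five years.
sample = [(1, True), (1, True), (2, True), (3, False), (3, True), (5, False)]
curve = survival_curve(sample, horizon=5)
```

In the invented sample, two of six keywords vanish in the first year, so the estimated survival after year 1 is 2/3. Censored keywords remain in the at-risk count up to the year their observation ends, which is what distinguishes this estimate from a simple proportion of surviving keywords.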