17,913 research outputs found

    Analysis of errors in the automatic translation of questions for translingual QA systems

    Purpose – This study evaluates systems for the automatic translation of questions destined for translingual question-answering (QA) systems. The efficacy of online translators as tools in QA systems is analysed using a collection of documents in Spanish. Design/methodology/approach – Automatic translation is evaluated in terms of the functionality of the actual translations produced by three online translators (Google Translator, Promt Translator, and Worldlingo) by means of objective and subjective evaluation measures, and the typology of the errors produced is identified. For this purpose, a comparative study was carried out of the quality of the translation, from German and French to Spanish, of factual questions from the CLEF collection of queries. Findings – The error rates of the three systems evaluated are higher for translations in the German-Spanish language pair. Promt was identified as the most reliable translator of the three (on average) for the two language combinations evaluated. However, for the German-Spanish pair, the Google online translator was also assessed favourably. Most errors (46.38 percent) were lexical in nature, followed by those due to a poor translation of the interrogative particle of the query (31.16 percent). Originality/value – The evaluation methodology applied focuses above all on the purpose of the translation. That is, does the resulting question serve as effective input to a translingual QA system? Thus, instead of searching for “perfection”, the study appraises the functionality of the question and its capacity to lead to an adequate response. The results obtained contribute to the development of improved translingual QA systems.

    Natural language processing

    Beginning with the basic issues of NLP, this chapter charts the major research activities in this area since the last ARIST chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems (text summarization, information extraction, information retrieval, etc.), including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.

    Predicting Native Language from Gaze

    A fundamental question in language learning concerns the role of a speaker's first language in second language acquisition. We present a novel methodology for studying this question: analysis of eye-movement patterns in second language reading of free-form text. Using this methodology, we demonstrate for the first time that the native language of English learners can be predicted from their gaze fixations when reading English. We provide analysis of classifier uncertainty and learned features, which indicates that differences in English reading are likely to be rooted in linguistic divergences across native languages. The presented framework complements production studies and offers new ground for advancing research on multilingualism. Comment: ACL 201

    An automatically built named entity lexicon for Arabic

    We have successfully adapted and extended the automatic Multilingual, Interoperable Named Entity Lexicon approach to Arabic, using Arabic WordNet (AWN) and Arabic Wikipedia (AWK). First, we extract AWN’s instantiable nouns and identify the corresponding categories and hyponym subcategories in AWK. Then, we exploit Wikipedia inter-lingual links to locate correspondences between articles in ten different languages in order to identify Named Entities (NEs). We apply keyword search on AWK abstracts to cover Arabic articles that have no correspondence in any of the other languages. In addition, we perform a post-processing step to fetch further NEs from AWK that are not reachable through AWN. Finally, we investigate diacritization using matching against geonames databases, the MADA-TOKAN tools and different heuristics for restoring the vowel marks of Arabic NEs. Using this methodology, we have extracted approximately 45,000 Arabic NEs and built, to the best of our knowledge, the largest, most mature and well-structured Arabic NE lexical resource to date. We have stored and organised this lexicon following the Lexical Markup Framework (LMF) ISO standard. We conduct a quantitative and qualitative evaluation of the lexicon against a manually annotated gold standard and achieve precision scores from 95.83% (with 66.13% recall) to 99.31% (with 61.45% recall), depending on the value of a threshold.

    Introduction to the special issue on cross-language algorithms and applications

    With the increasingly global nature of our everyday interactions, the need for multilingual technologies to support efficient and effective information access and communication cannot be overemphasized. Computational modeling of language has been the focus of Natural Language Processing, a subdiscipline of Artificial Intelligence. One of the current challenges for this discipline is to design methodologies and algorithms that are cross-language in order to create multilingual technologies rapidly. The goal of this JAIR special issue on Cross-Language Algorithms and Applications (CLAA) is to present leading research in this area, with emphasis on developing unifying themes that could lead to the development of the science of multi- and cross-lingualism. In this introduction, we provide the reader with the motivation for this special issue and summarize the contributions of the papers that have been included. The selected papers cover a broad range of cross-lingual technologies including machine translation, domain and language adaptation for sentiment analysis, cross-language lexical resources, dependency parsing, information retrieval and knowledge representation. We anticipate that this special issue will serve as an invaluable resource for researchers interested in topics of cross-lingual natural language processing. Postprint (published version

    Conclusions

    Estrategias de comunicación utilizadas por aprendices de español como L2 y los efectos del tipo de tarea [Communication strategies used by learners of Spanish as an L2 and the effects of task type]

    Indexed in: Scopus; Scielo. This study examines the possible effects of task type on the communication strategies (CSs) that Spanish L2 learners use in face-to-face interactions with other learners and with native speakers (NSs) of Spanish. Data were elicited from 36 interactions between Spanish L2 learners and native speakers of Spanish carrying out two tasks, a jigsaw and a free-conversation activity. Data collection involved video and audio recording, observation of the participants’ interactions and stimulated recall methodology. The spoken data were analysed using the taxonomy of Dörnyei and Kormos (1998) and the interactional CSs of Dörnyei and Scott (1997). Quantitative and qualitative analyses were conducted to determine a possible association between CS use and the task factor, and to identify the task effects. The findings show an association between task type and the learners’ use of CSs, one that is particularly influenced by the jigsaw. The focus of each task appears to influence the use of certain CSs in order to fulfil the demands of that task. The linguistic demands of the jigsaw and the cognitive demands of the free conversation appear to have the greatest effect on the learners’ use of specific CSs. https://scielo.conicyt.cl/scielo.php?script=sci_arttext&pid=S0718-09342018000100107&lng=en&nrm=iso&tlng=e