1,548 research outputs found

    Building Concept Graphs from Monolingual Dictionary Entries


    An analysis of The Oxford Guide to practical lexicography (Atkins and Rundell 2008)

    For at least a decade, the lexicographic community at large has been calling for a modern textbook, one that would place corpora at the centre of the lexicographic enterprise. Written by two of the most respected practising lexicographers, this book has finally arrived, and it delivers on many levels. This review article presents a critical analysis of its features.

    Building word embeddings from dictionary definitions


    Multimodal metadiscourse: analysis of glosses of the definitions and examples of an English dictionary

    In this paper, we examine multimodal resources that perform the metadiscursive function of clarifying the content of definitions and examples in the entries of the Collins COBUILD Illustrated Basic Dictionary of American English (2010). The study is based on Kress and van Leeuwen (2006), regarding the multimodal configuration of texts, on Hyland (1998, 2000, 2007, 2017), concerning the concept of metadiscourse, and on Kumpf (2000), Pontes (2010), Pontes and Fechine (2011, 2012), Fechine (2013), Rocha (2016), and Ribeiro and Pontes (2018), who discuss multimodal metadiscourse. We surveyed the various elements that clarify the content of the definitions and examples in the dictionary and observed that, in general, the content is clarified either through verbal text or through images. For this reason, we classified such elements as verbal and visual glosses. Then, we selected and analyzed representative samples of each type of gloss. We concluded that verbal and visual resources rework and even expand the text of definitions and examples in order to facilitate their understanding by a user who has limited knowledge of the English language.

    An Evaluation of Pictorial Illustrations in Urdu Dictionaries

    Pictures are of great significance and interest to dictionary users. This study addresses the use of pictorial illustrations in Urdu dictionaries available on the market. Pictorial illustrations help convey a clear concept of a word and aid word sense disambiguation, because graphic demonstrations are easier to comprehend than complex verbal explanations. Each word possesses a particular structure and form. Dictionaries are compiled to make the different forms of words, their proper sequence, and their synonyms intelligible. Dictionaries provide words but give little guidance on usage. Most of the words are scholarly items, used sparingly and mainly by scholars and philosophers. According to Al-Kasimi (1977), the use of pictorial illustrations should not be arbitrary or incidental. Graphic illustrations make it easier to recognize meaning. To ascertain to what degree and for what ends Urdu dictionaries use graphic demonstrations, a survey based on Stein (1991) was carried out. The dictionaries were selected at random and examined for any type of graphic demonstration, in order to identify those that use pictures for the purpose of illustration. The analysis also considered the arrangement of the illustrations. The survey found that few of the existing dictionaries make genuine use of graphic demonstrations, and it revealed many further discrepancies in Urdu dictionaries regarding the use of pictorial illustrations. The study concludes that ostensive addressing is a crucial lexicographical device that greatly assists the lexicographer in communicating all the requisite constituents and facets of headwords (lemmas).

    Cross-language multi-media information retrieval


    Bootstrapping Lexical Choice via Multiple-Sequence Alignment

    An important component of any generation system is the mapping dictionary, a lexicon of elementary semantic expressions and corresponding natural language realizations. Typically, labor-intensive knowledge-based methods are used to construct the dictionary. We instead propose to acquire it automatically via a novel multiple-pass algorithm employing multiple-sequence alignment, a technique commonly used in bioinformatics. Crucially, our method leverages latent information contained in multi-parallel corpora -- datasets that supply several verbalizations of the corresponding semantics rather than just one. We used our techniques to generate natural language versions of computer-generated mathematical proofs, with good results on both a per-component and overall-output basis. For example, in evaluations involving a dozen human judges, our system produced output whose readability and faithfulness to the semantic input rivaled that of a traditional generation system. Comment: 8 pages; to appear in the proceedings of EMNLP-200

    Kyoto: An Integrated System for Specific Domain WSD

    This document describes the preliminary release of the integrated Kyoto system for specific domain WSD. The system uses concept miners (Tybots) to extract domain-related terms and produces a domain-related thesaurus, followed by knowledge-based WSD based on wordnet graphs (UKB). The resulting system can be applied to any language with a lexical knowledge base, and is based on publicly available software and resources. Our participation in Semeval task #17 focused on producing running systems for all languages in the task, and we attained good results in all except Chinese. Due to the pressure of the time constraints in the competition, the system is still under development, and we expect results to improve in the near future.

    Cross-language plagiarism detection using multilingual semantic network

    The final publication is available at Springer via http://10.1007/978-3-642-36973-5_66
    Cross-language plagiarism refers to the type of plagiarism where the source and suspicious documents are in different languages. Plagiarism detection across languages is still in its infancy. In this article, we propose a new graph-based approach that uses a multilingual semantic network to compare document paragraphs in different languages. In order to investigate the proposed approach, we used the German-English and Spanish-English cross-language plagiarism cases of the PAN-PC'11 corpus. We compare the obtained results with two state-of-the-art models. Experimental results indicate that our graph-based approach is a good alternative for cross-language plagiarism detection.
    We thank the Conselleria d'educació, Formació i Ocupació of the Generalitat Valenciana for funding the work of the first author with the Gerónimo Forteza program. The research has been carried out in the framework of the European Commission WIQ-EI IRSES project (no. 269180) and the VLC/CAMPUS Microcluster on Multimodal Interaction in Intelligent Systems.
    Franco Salvador, M.; Gupta, PA.; Rosso, P. (2013). Cross-language plagiarism detection using multilingual semantic network. In Advances in Information Retrieval. Springer Verlag (Germany). 7814:710-713. https://doi.org/10.1007/978-3-642-36973-5_66