Assessing the contribution of shallow and deep knowledge sources for word sense disambiguation
Corpus-based techniques have proved to be very beneficial in the development of efficient and accurate approaches to word sense disambiguation (WSD), despite the fact that they generally represent relatively shallow knowledge. It has always been thought, however, that WSD could also benefit from deeper knowledge sources. We describe a novel approach to WSD that uses inductive logic programming to learn theories from first-order logic representations, allowing corpus-based evidence to be combined with any kind of background knowledge. This approach has been shown to be effective over several disambiguation tasks using a combination of deep and shallow knowledge sources. It is important to understand the contribution of the various knowledge sources used in such a system. This paper investigates the contribution of nine knowledge sources to the performance of the disambiguation models produced for the SemEval-2007 English lexical sample task. The outcome of this analysis will assist future work on WSD in concentrating on the most useful knowledge sources.
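To make the notion of a "shallow" knowledge source concrete, here is a minimal sketch of one classic corpus-based WSD heuristic, simplified Lesk, which picks the sense whose dictionary gloss shares the most words with the target word's context. The glosses and sense labels below are hypothetical illustrations; the paper's actual system learns first-order theories with ILP over much richer knowledge sources.

```python
def simplified_lesk(context_words, sense_glosses):
    """Return the sense id whose gloss overlaps the context the most."""
    context = set(w.lower() for w in context_words)
    best_sense, best_overlap = None, -1
    for sense, gloss in sense_glosses.items():
        # count shared word types between context and gloss
        overlap = len(context & set(gloss.lower().split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

# hypothetical glosses for two senses of "bank"
senses = {
    "bank/finance": "institution that accepts deposits and lends money",
    "bank/river": "sloping land beside a body of water",
}
print(simplified_lesk("she deposited money at the local branch".split(), senses))
# -> bank/finance
```

Even this crude overlap count disambiguates many contexts, which is why shallow sources remain a strong baseline against which deeper knowledge must justify itself.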
Architectures of Meaning, A Systematic Corpus Analysis of NLP Systems
This paper proposes a novel statistical corpus analysis framework targeted towards the interpretation of Natural Language Processing (NLP) architectural patterns at scale. The proposed approach combines saturation-based lexicon construction, statistical corpus analysis methods and graph collocations to induce a synthesis representation of NLP architectural patterns from corpora. The framework is validated on the full corpus of SemEval tasks and demonstrates coherent architectural patterns which can be used to answer architectural questions in a data-driven fashion, providing a systematic mechanism to interpret a largely dynamic and exponentially growing field.

Comment: 20 pages, 6 figures, 9 supplementary figures, Lexicon.txt in the appendix
Using linguistic data for English and Spanish verb-noun combination identification
We present a linguistic analysis of a set of English and Spanish verb+noun combinations (VNCs), and a method to use this information to improve VNC identification. Firstly, a sample of frequent VNCs is analysed in depth and tagged along lexico-semantic and morphosyntactic dimensions, obtaining satisfactory inter-annotator agreement scores. Then, a VNC identification experiment is undertaken, where the analysed linguistic data is combined with chunking information and syntactic dependencies. A comparison between the results of this experiment and those obtained by a basic detection method shows that VNC identification can be greatly improved by using linguistic information, as a large number of additional occurrences are detected with high precision.
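As a point of reference for the comparison above, a "basic detection method" for VNC candidates can be sketched as a simple pattern over POS-tagged text: propose every verb followed by a noun within a small window. The tag prefixes and window size below are illustrative assumptions, not the paper's actual baseline.

```python
def vnc_candidates(tagged, window=3):
    """tagged: list of (word, pos) pairs; return (verb, noun) candidate pairs.

    For each verb, take the first noun appearing within `window`
    following tokens as its candidate combination partner.
    """
    pairs = []
    for i, (word, pos) in enumerate(tagged):
        if pos.startswith("V"):
            for word2, pos2 in tagged[i + 1 : i + 1 + window]:
                if pos2.startswith("N"):
                    pairs.append((word, word2))
                    break
    return pairs

sent = [("take", "VB"), ("a", "DT"), ("decision", "NN"), (".", ".")]
print(vnc_candidates(sent))  # [('take', 'decision')]
```

A baseline of this kind over-generates heavily (any verb near any noun), which is exactly where lexico-semantic tags, chunking and dependency information can filter and extend the candidate set.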
Lexical comprehension and production in the Alexia system
In language learning, vocabulary is very important. Studies have shown that the dictionary is used very often in written comprehension tasks. However, its utility is not always obvious. In this paper we discuss the improvements electronic dictionaries can provide compared to classical paper ones. In lexical access, they help the learner by making the selection of, and search for, relevant information easier, thereby improving the efficiency of usage. Our system, Alexia, contains lexical information specific to learners. In lexical production, computers give us great possibilities through automatic processing. We show how we use an analyser and a parser to create new kinds of pedagogical activities.
Terminology extraction: an analysis of linguistic and statistical approaches
Are linguistic properties and behaviours important for recognising terms? Are statistical measures effective for extracting terms? Is it possible to capture a notion of termhood with computational linguistic techniques? Or are terms too sensitive to exogenous and pragmatic factors to be confined within computational linguistics? All these questions are still open. This study tries to contribute to the search for an answer, in the belief that it can be found only through a careful experimental analysis of real case studies and a study of their correlation with theoretical insights.
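One widely used statistical measure in the term-extraction literature is Dunning's log-likelihood ratio, which scores how much more often a word pair co-occurs than chance predicts. The sketch below is a generic illustration of such a measure, not a reconstruction of the study's own experiments; the counts in the example are made up.

```python
import math

def llr_bigram(c12, c1, c2, n):
    """Dunning log-likelihood ratio for a bigram (w1, w2).

    c12: joint count of (w1, w2); c1, c2: counts of w1 and w2; n: total bigrams.
    Higher scores indicate stronger association (better term candidates).
    """
    def loglik(k, m, p):
        # log-likelihood of k successes in m trials with success prob p
        p = min(max(p, 1e-12), 1 - 1e-12)  # guard against log(0)
        return k * math.log(p) + (m - k) * math.log(1 - p)

    p = c2 / n                      # null hypothesis: w2 independent of w1
    p1 = c12 / c1                   # P(w2 | w1)
    p2 = (c2 - c12) / (n - c1)      # P(w2 | not w1)
    null = loglik(c12, c1, p) + loglik(c2 - c12, n - c1, p)
    alt = loglik(c12, c1, p1) + loglik(c2 - c12, n - c1, p2)
    return 2 * (alt - null)

# a strongly associated pair scores far above a pair at chance frequency
strong = llr_bigram(c12=30, c1=40, c2=35, n=10000)
weak = llr_bigram(c12=2, c1=500, c2=40, n=10000)
print(strong > weak)  # True
```

Measures like this capture only the statistical side of termhood; the linguistic filters the abstract asks about (POS patterns, syntactic behaviour) are what such scores are usually combined with.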
An analysis of practical lexicography: a reader (Ed. Fontenelle 2008)
Intended as a companion volume to The Oxford Guide to Practical Lexicography (Atkins and Rundell 2008), Fontenelle's book aims to bring together the most relevant papers in practical lexicography. This review article presents a critical analysis of the success thereof, in both quantitative and qualitative terms.