
    Analyzing domestic violence with topographic maps: a comparative study.

    Topographic maps are an appealing exploratory instrument for discovering new knowledge from databases. In recent years, several variations on the Self-Organizing Map (SOM) have been introduced in the literature. In this paper, the toroidal Emergent SOM tool and the spherical SOM are used to analyze a text corpus consisting of police reports of all violent incidents that occurred during the first quarter of 2006 in the police region Amsterdam-Amstelland (The Netherlands). It is demonstrated that spherical topographic maps provide a powerful instrument for analyzing this dataset. In addition, the performance of the toroidal Emergent SOM is compared to that of the spherical SOM, and it turned out to be superior to that of an ordinary classifier applied directly to the data.
    Keywords: Topographic maps; Domestic violence; Knowledge discovery in databases; Emergent SOM; BLOSSOM

    Evaluation of Linguistic Features for Word Sense Disambiguation with Self-Organized Document Maps

    Word sense disambiguation automatically determines the appropriate senses of a word in context. We have previously shown that self-organized document maps have properties similar to a large-scale semantic structure that is useful for word sense disambiguation. This work evaluates the impact of different linguistic features on self-organized document maps for word sense disambiguation. The features evaluated are various qualitative features, e.g. part-of-speech and syntactic labels, and quantitative features, e.g. cut-off levels for word frequency. It is shown that linguistic features help make contextual information explicit. If the training corpus is large, even contextually weak features, such as base forms, will act in concert to produce sense distinctions in a statistically significant way. However, the most important features are syntactic dependency relations and base forms annotated with part-of-speech or syntactic labels. We achieve 62.9% ± 0.73% correct results on the fine-grained lexical task of the English SENSEVAL-2 data. On the 96.7% of the test cases which need no back-off to the most frequent sense, we achieve 65.7% correct results.
    Peer reviewed

    Towards improving WEBSOM with multi-word expressions

    Dissertation submitted for the degree of Master in Computer Engineering (Mestre em Engenharia Informática).
    Large quantities of free-text documents are usually rich in information and cover several topics. However, since their dimension is very large, searching and filtering data is an exhaustive task. A large text collection covers a set of topics, where each topic is affiliated with a group of documents. This thesis presents a method for building a document map of the core contents covered in the collection. WEBSOM is an approach that combines document encoding methods and Self-Organising Maps (SOM) to generate a document map. However, this methodology has a weakness in the document encoding method because it uses single words to characterise documents. Single words tend to be ambiguous and semantically vague, so some documents can be incorrectly related. This thesis proposes a new document encoding method to improve the WEBSOM approach by using multi-word expressions (MWEs) to describe documents. Previous research and ongoing experiments encourage us to use MWEs to characterise documents because these are semantically more accurate than single words and more descriptive.
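The encoding weakness described in this abstract can be illustrated with a minimal sketch. It is not the thesis's method: the MWE lexicon `MWES`, the helper names `encode` and `shared_terms`, and the two sample documents are all hypothetical, and the sketch covers only the document-encoding step, not SOM training. It shows how two unrelated documents share the ambiguous single word "network" but no longer overlap on it once multi-word expressions are treated as single terms.

```python
from collections import Counter

# Hypothetical MWE lexicon (assumption): the abstract does not say how
# multi-word expressions are extracted, so two are hard-coded here.
MWES = {("neural", "network"), ("social", "network")}

def encode(text, mwes=frozenset()):
    """Return term frequencies, merging known two-word MWEs into one term."""
    tokens = text.lower().split()
    terms, i = [], 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if len(pair) == 2 and pair in mwes:
            terms.append("_".join(pair))  # e.g. "neural_network"
            i += 2
        else:
            terms.append(tokens[i])
            i += 1
    return Counter(terms)

def shared_terms(a, b):
    """Terms that occur in both encoded documents, sorted alphabetically."""
    return sorted(set(a) & set(b))

doc_a = "training a neural network"
doc_b = "analysing a social network"

# Single-word encoding: both documents contain the term "network".
print(shared_terms(encode(doc_a), encode(doc_b)))
# MWE encoding: "neural_network" and "social_network" are distinct terms.
print(shared_terms(encode(doc_a, MWES), encode(doc_b, MWES)))
```

In a full pipeline, the resulting term-frequency vectors would be the input to the SOM; stop-word removal (which would also drop the shared "a") is omitted to keep the sketch short.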

    Information maps: tools for document exploration
