2,141 research outputs found

    Classifying Amharic News Text Using Self-Organizing Maps

    The paper addresses using artificial neural networks for classification of Amharic news items. Amharic is the language for countrywide communication in Ethiopia and has its own writing system containing extensive systematic redundancy. It is quite dialectally diversified and probably representative of the languages of a continent that so far has received little attention within the language processing field. The experiments investigated document clustering around user queries using Self-Organizing Maps, an unsupervised learning neural network strategy. The best ANN model showed a precision of 60.0% when trying to cluster unseen data, and a 69.5% precision when trying to classify it.

    Mapping the species richness and composition of tropical forests from remotely sensed data with neural networks

    The understanding and management of biodiversity is often limited by a lack of data. Remote sensing has considerable potential as a source of data on biodiversity at spatial and temporal scales appropriate for biodiversity management. To date, most remote sensing studies have focused on only one aspect of biodiversity, species richness, and have generally used conventional image analysis techniques that may not fully exploit the data's information content. Here, we report on a study that aimed to estimate biodiversity more fully from remotely sensed data with the aid of neural networks. Two neural network models, feedforward networks to estimate basic indices of biodiversity and Kohonen networks to provide information on species composition, were used. Biodiversity indices of species richness and evenness derived from the remotely sensed data were strongly correlated with those derived from field survey. For example, the predicted tree species richness was significantly correlated with that observed in the field (r=0.69, significant at the 95% level of confidence). In addition, there was a high degree of correspondence (~83%) between the partitioning of the outputs from Kohonen networks applied to tree species and remotely sensed data sets that indicated the potential to map species composition. Combining the outputs of the two sets of neural network based analyses enabled a map of biodiversity to be produced.

    Improving self-organising information maps as navigational tools: A semantic approach

    Purpose - The goal of the research is to explore whether the use of higher-level semantic features can help to build a better self-organising map (SOM) representation as measured from a human-centred perspective. The authors also explore an automatic evaluation method that utilises human expert knowledge encapsulated in the structure of traditional textbooks to determine map representation quality. Design/methodology/approach - Two types of document representations involving semantic features have been explored: using only one individual semantic feature, and mixing a semantic feature with keywords. Experiments were conducted to investigate the impact of semantic representation quality on the map. The experiments were performed on data collections from a single book corpus and a multiple book corpus. Findings - Combining keywords with certain semantic features achieves significant improvement of representation quality over the keywords-only approach in a relatively homogeneous single book corpus. Changing the ratios in combining different features also affects the performance. While semantic mixtures can work well in a single book corpus, they lose their advantages over keywords in the multiple book corpus. This raises a concern about whether the semantic representations in the multiple book corpus are homogeneous and coherent enough for applying semantic features. The terminology issue among textbooks affects the ability of the SOM to generate a high quality map for heterogeneous collections. Originality/value - The authors explored the use of higher-level document representation features for the development of better quality SOMs. In addition, the authors have piloted a specific method for evaluating the SOM quality based on the organisation of information content in the map. © 2011 Emerald Group Publishing Limited.

    Analysis of Data Clusters Obtained by Self-Organizing Methods

    Self-organizing methods were used to investigate the financial market. As an example we consider time-series data of the Dow Jones index for the years 2002-2003 (R. Mantegna, cond-mat/9802256). In order to reveal new structures in the stock market behavior of the companies making up the Dow Jones index we apply SOM (Self-Organizing Maps) and GMDH (Group Method of Data Handling) algorithms. Using SOM techniques we obtain SOM maps that establish a new relationship in market structure. Analysis of the obtained clusters was made by GMDH. Comment: 10 pages, 4 figures.

    A modified kohonen self-organizing map (KSOM) clustering for four categorical data

    The Kohonen Self-Organizing Map (KSOM) is one of the unsupervised learning algorithms for neural networks. This algorithm is used in solving problems in various areas, especially in clustering complex data sets. Despite its advantages, the KSOM algorithm has a few drawbacks, such as overlapping clusters and non-linearly separable problems. Therefore, this paper proposes a modified KSOM that is inspired by the pheromone approach in Ant Colony Optimization. The modification focuses on the distance calculation amongst objects. The proposed algorithm has been tested on four real categorical data sets obtained from the UCI machine learning repository: Iris, Seeds, Glass and Wisconsin Breast Cancer Database. The results show that the modified KSOM produced accurate clustering results and that all clusters can be clearly identified.

    Evaluation of Linguistic Features for Word Sense Disambiguation with Self-Organized Document Maps

    Word sense disambiguation automatically determines the appropriate senses of a word in context. We have previously shown that self-organized document maps have properties similar to a large-scale semantic structure that is useful for word sense disambiguation. This work evaluates the impact of different linguistic features on self-organized document maps for word sense disambiguation. The features evaluated are various qualitative features, e.g. part-of-speech and syntactic labels, and quantitative features, e.g. cut-off levels for word frequency. It is shown that linguistic features help make contextual information explicit. If the training corpus is large, even contextually weak features, such as base forms, will act in concert to produce sense distinctions in a statistically significant way. However, the most important features are syntactic dependency relations and base forms annotated with part of speech or syntactic labels. We achieve 62.9%±0.73% correct results on the fine-grained lexical task of the English SENSEVAL-2 data. On the 96.7% of the test cases which need no back-off to the most frequent sense we achieve 65.7% correct results. Peer reviewed.

    Text mining with the WEBSOM

    The emerging field of text mining applies methods from data mining and exploratory data analysis to analyzing text collections and to conveying information to the user in an intuitive manner. Visual, map-like displays provide a powerful and fast medium for portraying information about large collections of text. Relationships between text items and collections, such as similarity, clusters, gaps and outliers, can be communicated naturally using spatial relationships, shading, and colors. In the WEBSOM method the self-organizing map (SOM) algorithm is used to automatically organize very large and high-dimensional collections of text documents onto two-dimensional map displays. The map forms a document landscape where similar documents appear close to each other at points of the regular map grid. The landscape can be labeled with automatically identified descriptive words that convey properties of each area and also act as landmarks during exploration. With the help of an HTML-based interactive tool the ordered landscape can be used in browsing the document collection and in performing searches on the map. An organized map offers an overview of an unknown document collection, helping the user in familiarizing herself with the domain. Map displays that are already familiar can be used as visual frames of reference for conveying properties of unknown text items. Static, thematically arranged document landscapes provide meaningful backgrounds for dynamic visualizations of, for example, time-related properties of the data. Search results can be visualized in the context of related documents. Experiments on document collections of various sizes, text types, and languages show that the WEBSOM method is scalable and generally applicable. Preliminary results in a text retrieval experiment indicate that even when the additional value provided by the visualization is disregarded the document maps perform at least comparably with more conventional retrieval methods.
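
    The idea of placing documents on a regular two-dimensional grid can be illustrated with a minimal sketch. This is not the WEBSOM implementation: it is a plain textbook SOM in pure Python, and the bag-of-words encoding, the 3x3 grid, the linearly decaying learning rate and neighbourhood width, and the helper names vectorize, best_matching_unit and train_som are all assumptions chosen for illustration.

```python
import math
import random

def vectorize(docs):
    """Turn documents into L2-normalised term-count vectors (bag of words)."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for d in docs:
        v = [0.0] * len(vocab)
        for w in d.lower().split():
            v[index[w]] += 1.0
        norm = math.sqrt(sum(c * c for c in v)) or 1.0
        vectors.append([c / norm for c in v])
    return vectors

def best_matching_unit(weights, x):
    """Return (row, col) of the map unit whose weight vector is nearest to x."""
    best, best_d = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))  # squared Euclidean
            if d < best_d:
                best, best_d = (r, c), d
    return best

def train_som(data, rows=3, cols=3, epochs=100, lr0=0.5, seed=0):
    """Train a small rectangular SOM with a Gaussian neighbourhood (assumed settings)."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    sigma0 = max(rows, cols) / 2.0
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                 # learning rate decays to 0
            sigma = sigma0 * (1.0 - frac) + 1e-3    # neighbourhood shrinks over time
            x = data[i]
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    # Units close to the winner on the grid are pulled more strongly.
                    h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
                    w = weights[r][c]
                    for k in range(dim):
                        w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights

# Toy collection (hypothetical data, not taken from any of the papers listed here).
docs = [
    "stock market index falls",
    "market index stock rises",
    "forest species richness map",
    "tree species forest survey",
]
vectors = vectorize(docs)
som = train_som(vectors)
positions = [best_matching_unit(som, v) for v in vectors]
for doc, pos in zip(docs, positions):
    print(pos, doc)
```

    Each printed pair is the grid cell a document lands on; in a real document map, cells would additionally be labeled with descriptive words, which this sketch leaves out.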