4 research outputs found

    Data Mining: How Popular Is It?

    Get PDF
    Data Mining is a process used in the industry, to facilitate decision making. As the name implies, large volumes of data is mined or sifted, to find useful information for decision making. With the advent of E-business, Data Mining has become more important to practitioners. The purpose of this paper is to find out the importance of Data Mining by looking at the different application areas that have used data mining for decision making

    A Cluster-indexing CBR Model for Collaborative Filtering Recommendation

    Get PDF

    Computer aided identification of biological specimens using self-organizing maps

    Get PDF
    For scientific or socio-economic reasons it is often necessary or desirable that biological material be identified. Given that there are an estimated 10 million living organisms on Earth, the identification of biological material can be problematic. Consequently the services of taxonomist specialists are often required. However, if such expertise is not readily available it is necessary to attempt an identification using an alternative method. Some of these alternative methods are unsatisfactory or can lead to a wrong identification. One of the most common problems encountered when identifying specimens is that important diagnostic features are often not easily observed, or may even be completely absent. A number of techniques can be used to try to overcome this problem, one of which, the Self Organizing Map (or SOM), is a particularly appealing technique because of its ability to handle missing data. This thesis explores the use of SOMs as a technique for the identification of indigenous trees of the Acacia species in KwaZulu-Natal, South Africa. The ability of the SOM technique to perform exploratory data analysis through data clustering is utilized and assessed, as is its usefulness for visualizing the results of the analysis of numerical, multivariate botanical data sets. The SOM’s ability to investigate, discover and interpret relationships within these data sets is examined, and the technique’s ability to identify tree species successfully is tested. These data sets are also tested using the C5 and CN2 classification techniques. Results from both these techniques are compared with the results obtained by using a SOM commercial package. These results indicate that the application of the SOM to the problem of biological identification could provide the start of the long-awaited breakthrough in computerized identification that biologists have eagerly been seeking.Dissertation (MSc)--University of Pretoria, 2011.Computer Scienceunrestricte
    corecore