17 research outputs found

    Data Classification and Its Application in Credit Card Approval

    We now live in the information age. Businesses, companies and agencies collect large amounts of data, and recent advances in technologies that automate and improve data collection have increased these volumes further. Hidden in all this data is potentially useful information that is rarely made explicit or taken advantage of. In this context, data mining has arisen as an important research area that helps to reveal useful information hidden in the raw data collected. Much intensive research has been conducted to enhance the capability of data mining solutions to provide intelligence so that different types of businesses can make informed decisions. This project demonstrates how data mining can address the need for business intelligence in decision-making. An analysis of the field of data mining shows how data mining, especially data classification, can help in applications such as targeted marketing, credit card approval, fraud detection, medical diagnosis, and scientific work. The project identifies the available algorithms used in data classification and implements the C4.5 decision tree induction algorithm to solve the data classification task. A sample credit card approval dataset is used to demonstrate the functionality of a data mining solution prototype, which covers the typical tasks of a decision tree induction process: data selection, data preprocessing, decision tree induction, tree pruning, rule generation and validation. The result of applying the prototype to the sample credit card approval dataset includes a decision tree, a set of rules derived from the decision tree, and its accuracy. These outputs help to identify the patterns of applicants who are more likely to be accepted or rejected. The set of rules can be used as part of the knowledge base in an expert system or decision support system for financial institutions.
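
The abstract names C4.5 but gives no implementation details. As a rough illustration only, the sketch below computes the gain ratio, the criterion C4.5 is generally described as using to choose a split attribute. The attribute names (`employed`, `owns_home`, `approved`) and the four-row dataset are hypothetical and are not taken from the project's credit card approval data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, attr, target):
    """Information gain of splitting on `attr`, normalised by split information."""
    total = len(rows)
    # Partition the target labels by the value of the candidate attribute.
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attr], []).append(row[target])
    base = entropy([row[target] for row in rows])
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    split_info = -sum((len(p) / total) * math.log2(len(p) / total)
                      for p in partitions.values())
    # An attribute with a single value carries no split information.
    return (base - remainder) / split_info if split_info > 0 else 0.0


# Hypothetical toy applicants, for illustration only.
applicants = [
    {"employed": "yes", "owns_home": "yes", "approved": "yes"},
    {"employed": "yes", "owns_home": "no", "approved": "yes"},
    {"employed": "no", "owns_home": "yes", "approved": "no"},
    {"employed": "no", "owns_home": "no", "approved": "no"},
]

for attribute in ("employed", "owns_home"):
    print(attribute, gain_ratio(applicants, attribute, "approved"))
```

In this toy data `employed` separates the two classes perfectly while `owns_home` carries no information, so a tree builder using this criterion would split on `employed` first. A full C4.5 implementation also handles continuous attributes, missing values and pruning, none of which is shown here.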

    Visual Exploration of Text Collections

    Despite many technological advances, the information overload problem still prevails in many application areas. It is challenging for users who are inundated with data to explore different facets of a complex information space to extract and put several pieces of facts together into a big picture that allows them to see various aspects of the data. Nevertheless, the availability of data should be embraced, not considered a threat for individuals and businesses alike. As a substantial amount of invaluable information to be explored resides within unstructured text data, there is a need to support users in visual exploration of text collections to obtain useful understandings that can be turned into worthwhile results. In this dissertation, we present our contributions in this area. We propose an approach to support users in exploring collections of text documents based on their interests and knowledge, which are represented by entities within an ontology. This ontology is used to drive the exploration and can be enriched with newly discovered entities matching users' interests in the process. Coordinated multiple views are used to visualize various aspects of text collections in relation to the set of entities of interest to users. To support faceted filtering of a large number of documents, we show how a multi-dimensional visualization can be employed as an alternative to the traditional linear listing of focus items. In this visualization, visual abstraction based on a combination of a conceptual structure and the structural equivalence of documents can be simultaneously used to deal with a large number of items. Furthermore, the approach also enables visual ordering based on the importance of facet values to support prioritized, cross-facet comparisons of focus items. We also report on an approach to support users' comprehension of the distribution of entities within a document based on the classic TileBars paradigm. Our approach employs a simplified version of a matrix reordering technique, which is based on the barycenter heuristic for bigraph edge crossing minimization, to reorder elements of TileBars-based Entities Distribution Views to tackle the visual complexity problem. The resulting reordered views enable users to quickly and easily identify which entities appear in the beginning, the end, or throughout a document. Lastly, our work is also concerned with visual concordance analysis, which supports users in understanding how terms are used within a document by investigating their usage contexts. To abstract away the textual details and yet retain the core facets of a term's contexts for visualization, we employ a statistical topic modeling method to group together words that are thematically related. These groups are used to visualize the gist of a term's usage contexts in a visualization called Context Stamp.
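
The abstract does not spell out the simplified reordering, so the following is only a minimal sketch of a one-sided barycenter pass, assuming an entity-by-segment occurrence matrix in which the columns (text segments) keep their document order and only the rows (entities) are reordered. The entity labels and matrix values are hypothetical.

```python
def barycenter(row):
    """Mean column index of the non-zero cells in a row.

    Rows with no occurrences get infinity so that they sort last.
    """
    positions = [j for j, value in enumerate(row) if value]
    return sum(positions) / len(positions) if positions else float("inf")


def reorder_rows(labels, matrix):
    """Sort rows by barycenter; sorted() is stable, so ties keep input order."""
    order = sorted(range(len(matrix)), key=lambda i: barycenter(matrix[i]))
    return [labels[i] for i in order], [matrix[i] for i in order]


# Hypothetical occurrences of three entities across five text segments.
labels = ["late_entity", "early_entity", "spread_entity"]
matrix = [
    [0, 0, 0, 1, 1],  # appears near the end
    [1, 1, 0, 0, 0],  # appears near the beginning
    [1, 0, 1, 0, 1],  # appears throughout
]

new_labels, new_matrix = reorder_rows(labels, matrix)
for label, row in zip(new_labels, new_matrix):
    print(f"{label:14s}", row)
```

After the pass, entities concentrated at the start of the document sit at the top and those concentrated at the end sit at the bottom, which matches the reading behaviour the abstract describes. The general heuristic for bigraph crossing minimization alternates such passes over both vertex sets; fixing the column order is an assumption made here for the document-segment case.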

    Visual Abstraction and Ordering in Faceted Browsing of Text Collections

    While faceted navigation interfaces can assist users in exploring an information collection, there is as yet little support for users in choosing a relevant item from the set of items returned by a filtering process. In this paper, we propose using a multi-dimensional visualization as an alternative to the linear listing of focus items. We describe how visual abstraction based on a combination of structural equivalence and conceptual structure can be used to deal with a large number of items, as well as visual ordering based on the importance of facet values to support cross-facet comparison of focus items. This visual support for faceted browsing has been developed for visual exploration of text collections.
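
One plausible reading of abstraction by structural equivalence is that documents linked to exactly the same set of facet values can be collapsed into a single visual item. The sketch below illustrates that reading only; the paper's actual definition may differ, and the document identifiers and facet values are hypothetical.

```python
from collections import defaultdict


def group_equivalent(docs):
    """Group document ids that share exactly the same set of facet values."""
    groups = defaultdict(list)
    for doc_id, facet_values in docs.items():
        # frozenset makes the facet-value set hashable and order-independent.
        groups[frozenset(facet_values)].append(doc_id)
    return dict(groups)


# Hypothetical documents and the facet values each one matches.
docs = {
    "doc1": ["Person:Alice", "Topic:Visualization"],
    "doc2": ["Topic:Visualization", "Person:Alice"],
    "doc3": ["Person:Bob"],
}

for facet_set, members in group_equivalent(docs).items():
    print(sorted(facet_set), "->", members)
```

Here `doc1` and `doc2` collapse into one group because they match the same facet values, so a view would need to draw two items rather than three.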

    Tight coupling of personal interests with multi-dimensional visualization for exploration and analysis of text collections

    In this paper, we present an interactive matrix-based multi-dimensional visualization component which enables users to explore a text collection along different conceptual dimensions. Of importance in our approach are the tight coupling of the users' personal ontologies, which represent their spheres of interest, with the visualization component, and the application of the barycenter heuristic for edge crossing minimization to enhance its visual display. We also discuss how IVEA, the information visualization tool containing the proposed component, can address the commonly perceived constraints of building a personal ontology from scratch for IVEA to work.