5 research outputs found

    An Interactive Visualization Environment for Data Exploration Using Points of Interest

    No full text
    We present an interactive method for visualizing numeric or symbolic data that allows a domain expert to extract useful knowledge and information. We propose a new approach based on points of interest (POIs), applied in the context of visual data mining. POIs are located on a circle, and data are displayed within this circle according to their similarities to these POIs. Interactive actions are possible: selection, zoom, and dynamic change of POIs. We evaluate the properties of this visualization on standard data sets with known characteristics. We describe an industrial application that explores results from satisfaction surveys.
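
    The layout idea in this abstract (POIs on a circle, data placed inside according to similarity) can be illustrated with a minimal sketch. The abstract does not give the placement rule, so the similarity-weighted average used here is an assumption chosen for illustration, and the names poi_positions and place_item are hypothetical.

```python
import math

def poi_positions(n_poi, radius=1.0):
    """Place n_poi points of interest evenly on a circle centred at the origin."""
    return [
        (radius * math.cos(2 * math.pi * k / n_poi),
         radius * math.sin(2 * math.pi * k / n_poi))
        for k in range(n_poi)
    ]

def place_item(similarities, pois):
    """Position one data item as the similarity-weighted average of the POIs.

    Assumption (not from the abstract): similarities are non-negative and
    the weighted average is the placement rule.
    """
    total = sum(similarities)
    if total == 0:
        return (0.0, 0.0)  # no similarity to any POI: put the item at the centre
    x = sum(s * px for s, (px, _) in zip(similarities, pois)) / total
    y = sum(s * py for s, (_, py) in zip(similarities, pois)) / total
    return (x, y)

# Usage: four POIs; one item similar only to the first POI, one equally similar to all.
pois = poi_positions(4)
print(place_item([1.0, 0.0, 0.0, 0.0], pois))
print(place_item([1.0, 1.0, 1.0, 1.0], pois))
```

    With non-negative weights, each item is a convex combination of the POI positions and so always falls inside the circle. An item similar to a single POI lands on that POI, and an item equally similar to all POIs lands near the centre.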

    Usage-driven Maintenance of Knowledge Organization Systems

    Full text link
    Knowledge Organization Systems (KOS) are typically used as background knowledge for document indexing in information retrieval. They have to be maintained and adapted constantly to reflect changes in the domain and the terminology. In this thesis, approaches are provided that support the maintenance of hierarchical knowledge organization systems, like thesauri, classifications, or taxonomies, by making information about the usage of KOS concepts available to the maintainer. The central contribution is the ICE-Map Visualization, a treemap-based visualization on top of a generalized statistical framework that is able to visualize almost arbitrary usage information. The proper selection of an existing KOS for available documents and the evaluation of a KOS for different indexing techniques by means of the ICE-Map Visualization are demonstrated. For the creation of a new KOS, an approach based on crowdsourcing is presented that uses feedback from Amazon Mechanical Turk to relate terms hierarchically. The extension of an existing KOS with new terms derived from the documents to be indexed is performed with a machine-learning approach that relates the terms to existing concepts in the hierarchy. The features are derived from text snippets in the result list of a web search engine. For the splitting of overpopulated concepts into new subconcepts, an interactive clustering approach is presented that is able to propose names for the new subconcepts. The implementation of a framework is described that integrates all approaches of this thesis and contains the reference implementation of the ICE-Map Visualization. It is extendable and supports the implementation of evaluation methods that build on other evaluations. Additionally, it supports the visualization of the results and the implementation of new visualizations. An important building block for practical applications is the simple linguistic indexer that is presented as a minor contribution. It is knowledge-poor and works without any training. This thesis applies computer science approaches in the domain of information science. The introduction describes the foundations in information science; in the conclusion, the focus is set on the relevance for practical applications, especially regarding the handling of different qualities of KOSs due to automatic and semiautomatic maintenance.

    Visualisation et fouille interactive de données à base de points d'intérêts

    No full text
    In this thesis, we address the problem of visual data mining. We observe that visualization methods are generally specific to particular types of data and that analyzing the results takes a long time before a satisfactory answer about the data is obtained. We have therefore developed an interactive visualization environment for data exploration using points of interest. This tool visualizes all types of data and is generic because it relies only on a similarity measure. Such methods must also be able to handle large data sets. We therefore sought to improve the performance of our visualization algorithms, and managed to display one million data items. We also extended our tool to data clustering. Most existing clustering methods work automatically, with little user involvement in the process. We aim to involve the user more significantly in the clustering process in order to improve their understanding of the data.
