3 research outputs found

    Utilisation d'outils de Visual Data Mining pour l'exploration d'un ensemble de règles d'association

    Get PDF
    International audienceData Mining aims at extracting maximum of knowledge from huge databases. It is realized by an automatic process or by data visual exploration with interactive tools. Automatic data mining extracts all the patterns which match a set of metrics. The limit of such algorithms is the amount of extracted data which can be larger than the initial data volume. In this article, we focus on association rules extraction with Apriori algorithm. After the description of a characterization model of a set of association rules, we propose to explore the results of a Data Mining algorithm with an interactive visual tool. There are two advantages. First it will visualize the results of the algorithms from different points of view (metrics, rules attributes). Then it allows us to select easily inside large set of rules the most relevant ones

    Categorization of interestingness measures for knowledge extraction

    Full text link
    Finding interesting association rules is an important and active research field in data mining. The algorithms of the Apriori family are based on two rule extraction measures, support and confidence. Although these two measures have the virtue of being algorithmically fast, they generate a prohibitive number of rules most of which are redundant and irrelevant. It is therefore necessary to use further measures which filter uninteresting rules. Many synthesis studies were then realized on the interestingness measures according to several points of view. Different reported studies have been carried out to identify "good" properties of rule extraction measures and these properties have been assessed on 61 measures. The purpose of this paper is twofold. First to extend the number of the measures and properties to be studied, in addition to the formalization of the properties proposed in the literature. Second, in the light of this formal study, to categorize the studied measures. This paper leads then to identify categories of measures in order to help the users to efficiently select an appropriate measure by choosing one or more measure(s) during the knowledge extraction process. The properties evaluation on the 61 measures has enabled us to identify 7 classes of measures, classes that we obtained using two different clustering techniques.Comment: 34 pages, 4 figure

    Algorithmes automatiques pour la fouille visuelle de données et la visualisation de règles d’association : application aux données aéronautiques

    Get PDF
    Depuis quelques années, nous assistons à une véritable explosion de la production de données dans de nombreux domaines, comme les réseaux sociaux ou le commerce en ligne. Ce phénomène récent est renforcé par la généralisation des périphériques connectés, dont l'utilisation est devenue aujourd'hui quasi-permanente. Le domaine aéronautique n'échappe pas à cette tendance. En effet, le besoin croissant de données, dicté par l'évolution des systèmes de gestion du trafic aérien et par les événements, donne lieu à une prise de conscience sur leur importance et sur une nouvelle manière de les appréhender, qu'il s'agisse de stockage, de mise à disposition et de valorisation. Les capacités d'hébergement ont été adaptées, et ne constituent pas une difficulté majeure. Celle-ci réside plutôt dans le traitement de l'information et dans l'extraction de connaissances. Dans le cadre du Visual Analytics, discipline émergente née des conséquences des attentats de 2001, cette extraction combine des approches algorithmiques et visuelles, afin de bénéficier simultanément de la flexibilité, de la créativité et de la connaissance humaine, et des capacités de calculs des systèmes informatiques. Ce travail de thèse a porté sur la réalisation de cette combinaison, en laissant à l'homme une position centrale et décisionnelle. D'une part, l'exploration visuelle des données, par l'utilisateur, pilote la génération des règles d'association, qui établissent des relations entre elles. D'autre part, ces règles sont exploitées en configurant automatiquement la visualisation des données concernées par celles-ci, afin de les mettre en valeur. Pour cela, ce processus bidirectionnel entre les données et les règles a été formalisé, puis illustré, à l'aide d'enregistrements de trafic aérien récent, sur la plate-forme Videam que nous avons développée. Celle-ci intègre, dans un environnement modulaire et évolutif, plusieurs briques IHM et algorithmiques, permettant l'exploration interactive des données et des règles d'association, tout en laissant à l'utilisateur la maîtrise globale du processus, notamment en paramétrant et en pilotant les algorithmes. ABSTRACT : In the past few years, we have seen a large scale data production in many areas, such as social networks and e-business. This recent phenomenon is enhanced by the widespread use of devices, which are permanently connected. The aeronautical field is also involved in this trend. Indeed, its growing need for data, which is driven by air trafic management systems evolution and by events, leads to a widescale focus on its key role and on new ways to manage it. It deals with storage, availability and exploitation. Data hosting capacity, that has been adapted, is not a major challenge. The issue is now in data processing and knowledge extraction from it. Visual Analytics is an emerging field, stemming from the September 2001 events. It combines automatic and visual approaches, in order to benefit simultaneously from human flexibility, creativity and knowledge, and also from processing capacities of computers. This PhD thesis has focused on this combination, by giving to the operator a centered and decisionmaking role. On the one hand, the visual data exploration drives association rules extraction. They correspond to links between the data. On the other hand, these rules are exploited by automatically con_gurating the visualization of the concerned data, in order to highlight it. To achieve this, a bidirectional process has been formalized, between data and rules. It has been illustrated by air trafic recordings, thanks to the Videam platform, that we have developed. By integrating several HMI and algorithmic applications in a modular and upgradeable environment, it allows interactive exploration of both data and association rules. This is done by giving to human the mastering of the global process, especially by setting and driving algorithms
    corecore