
    Multiplex Graph Association Rules for Link Prediction

    Multiplex networks allow us to study a variety of complex systems where nodes connect to each other in multiple ways, for example friend, family, and co-worker relations in social networks. Link prediction is the branch of network analysis allowing us to forecast the future status of a network: which new connections are the most likely to appear in the future? In multiplex link prediction we also ask: of which type? Because this last question is unanswerable with classical link prediction, here we investigate the use of graph association rules to inform multiplex link prediction. We derive such rules by identifying all frequent patterns in a network via multiplex graph mining, and then score each unobserved link's likelihood by finding the occurrences of each rule in the original network. Association rules add new abilities to multiplex link prediction: to predict new node arrivals, to consider higher order structures with four or more nodes, and to be memory efficient. In our experiments, we show that, exploiting graph association rules, we are able to achieve a prediction performance close to an ideal ensemble classifier. Further, we perform a case study on a signed multiplex network, showing how graph association rules can provide valuable insights to extend social balance theory. Comment: Accepted for publication in 15th International Conference on Web and Social Media (ICWSM) 202
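
The scoring step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the rule here is written by hand rather than mined, it covers only three-node wedge patterns, and the names `edges`, `rules`, `build_adjacency`, and `score_links` as well as all data values are assumptions made for illustration. Each rule says that an open wedge u-v-w with two given edge types tends to be closed by a u-w edge of a third type, and every unobserved typed link accumulates the confidence of each rule occurrence that supports it.

```python
from collections import defaultdict
from itertools import combinations

# Toy multiplex network: each edge is (node, node, layer). Made-up data.
edges = [
    ("ann", "bob", "friend"),
    ("bob", "cat", "friend"),
    ("ann", "cat", "coworker"),
    ("cat", "dan", "friend"),
    ("bob", "dan", "friend"),
]

# Hypothetical hand-written rule (a real system would mine these):
# a wedge with the two given layers tends to be closed by `closes_as`.
rules = [
    {"wedge": ("friend", "friend"), "closes_as": "coworker", "confidence": 0.5},
]

def build_adjacency(edges):
    """Map node -> neighbour -> set of layers connecting them."""
    adj = defaultdict(lambda: defaultdict(set))
    for u, v, layer in edges:
        adj[u][v].add(layer)
        adj[v][u].add(layer)
    return adj

def score_links(edges, rules):
    """Sum rule confidences over every wedge that supports an unobserved typed link."""
    adj = build_adjacency(edges)
    scores = defaultdict(float)
    for centre in list(adj):
        # Every pair of neighbours of `centre` forms a wedge u-centre-w.
        for u, w in combinations(sorted(adj[centre]), 2):
            for rule in rules:
                a, b = rule["wedge"]
                layer = rule["closes_as"]
                if layer in adj[u].get(w, set()):
                    continue  # this typed link is already observed
                matches = (a in adj[centre][u] and b in adj[centre][w]) or (
                    b in adj[centre][u] and a in adj[centre][w]
                )
                if matches:
                    scores[(u, w, layer)] += rule["confidence"]
    return dict(scores)

scores = score_links(edges, rules)
```

Because each score is keyed by (node, node, layer), the ranking answers both questions raised in the abstract: which link is likely to appear, and of which type.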

    Fast Multiplex Graph Association Rules for Link Prediction

    Multiplex networks allow us to study a variety of complex systems where nodes connect to each other in multiple ways, for example friend, family, and co-worker relations in social networks. Link prediction is the branch of network analysis allowing us to forecast the future status of a network: which new connections are the most likely to appear in the future? In multiplex link prediction we also ask: of which type? Because this last question is unanswerable with classical link prediction, here we investigate the use of graph association rules to inform multiplex link prediction. We derive such rules by identifying all frequent patterns in a network via multiplex graph mining, and then score each unobserved link's likelihood by finding the occurrences of each rule in the original network. Association rules add new abilities to multiplex link prediction: to predict new node arrivals, to consider higher order structures with four or more nodes, and to be memory efficient. We improve over previous work by creating a framework that is also efficient in terms of runtime, which enables an increase in prediction performance. This increase in efficiency allows us to improve a case study on a signed multiplex network, showing how graph association rules can provide valuable insights to extend social balance theory. Comment: arXiv admin note: substantial text overlap with arXiv:2008.0835

    Visualization of Frequent Itemsets with Nested Circular Layout and Bundling Algorithm

    Frequent itemset mining is one of the major data mining issues. Once generated by algorithms, the itemsets can be automatically processed, for instance to extract association rules. They can also be explored with visual tools, in order to analyze the emerging patterns. Graphical itemset representation is a convenient way to obtain an overview of the global interaction structure. However, when the complexity of the database increases, the network may become unreadable. In this paper, we propose to display itemsets on concentric circles, each one being organized to lower the intricacy of the graph through an optimization process. Thanks to a graph bundling algorithm, we finally obtain a compact representation of a large set of itemsets that is easier to exploit. Color accumulation and interaction operators facilitate the exploration of the new bundle graph and illustrate how strongly each itemset is supported by the data.
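
As background for the terms used in this abstract, the following is a minimal sketch of how frequent itemsets and the support and confidence of an association rule are commonly defined. It uses brute-force enumeration rather than any algorithm from the paper, and the transactions and the names `support`, `frequent_itemsets`, and `confidence` are assumptions made for illustration.

```python
from itertools import combinations

# Made-up transaction database.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration; only practical for tiny examples."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            s = support(candidate, transactions)
            if s >= min_support:
                result[frozenset(candidate)] = s
    return result

def confidence(antecedent, consequent, transactions):
    """Support of the whole rule divided by support of its antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

frequent = frequent_itemsets(transactions, min_support=0.5)
butter_to_bread = confidence({"butter"}, {"bread"}, transactions)
bread_to_milk = confidence({"bread"}, {"milk"}, transactions)
```

The support value stored for each itemset is the kind of quantity a visual encoding such as the one described above would need to convey.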

    Learning lost temporal fuzzy association rules

    Fuzzy association rule mining discovers patterns in transactions, such as shopping baskets in a supermarket, or Web page accesses by a visitor to a Web site. Temporal patterns can be present in fuzzy association rules because the underlying process generating the data can be dynamic. However, existing solutions may not discover all interesting patterns because of a previously unrecognised problem that is revealed in this thesis. The contextual meaning of fuzzy association rules changes because of the dynamic feature of data. The static fuzzy representation and traditional search method are inadequate. The Genetic Iterative Temporal Fuzzy Association Rule Mining (GITFARM) framework solves the problem by utilising flexible fuzzy representations from a fuzzy rule-based system (FRBS). The combination of temporal, fuzzy and itemset space was simultaneously searched with a genetic algorithm (GA) to overcome the problem. The framework transforms the dataset to a graph for efficiently searching the dataset. A choice of model in fuzzy representation provides a trade-off in usage between an approximate and descriptive model. A method for verifying the solution to the hypothesised problem was presented. The proposed GA-based solution was compared with a traditional approach that uses an exhaustive search method. It was shown how the GA-based solution discovered rules that the traditional approach did not. This shows that simultaneously searching for rules and membership functions with a GA is a suitable solution for mining temporal fuzzy association rules. So, in practice, more knowledge can be discovered for making well-informed decisions that would otherwise be lost with a traditional approach. EPSRC DT

    Acquisition of patterns from medical records

    In recent years, the volume of information available electronically has increased exponentially, and the field of primary health care has not been an exception. The increasing availability of this electronic data opens up the potential to discover patterns that predict the risk of new diseases, supporting personalized care and improving quality of life. Extracting frequent patterns from medical records represents a huge challenge in Data Mining, since in this context analyzing the temporal relationships between clinical instances is essential. In the TADIA-MED research project, data containing information on visits of patients at Primary Care Centers (CAP) throughout Catalonia was obtained. All annotations in the textbook that the doctor registers in the health system during visits follow what is called the MEAP structure (Motiu de la consulta, Exploració, Avaluació i Pla d'actuació, in Catalan). The information contained in these MEAPs was classified into Diagnostics, Signs or symptoms, Drugs, or Body parts. This information was represented as a graph and stored in a Neo4J server. In this thesis, a new formulation is presented which defines how to compute the temporal association rules in the explained context. The obtained rules are intended to be diagnostic aid patterns. We have also developed an algorithm that uses our formulation to extract the temporal rules. This algorithm makes it possible to parameterize the desired rules in various aspects with respect to the desired format or temporality. We are also capable of extracting rules at different levels of abstraction. Finally, we have defined a process for evaluating the rules obtained. The designed process will be the evaluation process of the entire TADIA-MED project. In spite of the small volume of available data, the evaluation of the rules obtained has been very promising and will help us to continue improving.

    User-Driven Pattern Mining on knowledge graphs: an Archaeological Case Study

    In recent years, there has been a growing interest from the Digital Humanities in knowledge graphs as a data modelling paradigm. Already, many data sets have been published as such and are available in the Linked Open Data cloud. With it, the nature of these data has shifted from unstructured to structured. This presents new opportunities for data mining. In this work, we investigate to what extent data mining can contribute to the understanding of archaeological knowledge, expressed as a knowledge graph, and which form would best meet the communities' needs. A case study was held which involved the user-driven mining of generalized association rules. Experiments have shown that the approach yielded mostly plausible patterns, some of which were seen as highly relevant by domain experts.

    Web Usage Mining with Evolutionary Extraction of Temporal Fuzzy Association Rules

    In Web usage mining, fuzzy association rules that have a temporal property can provide useful knowledge about when associations occur. However, there is a problem with traditional temporal fuzzy association rule mining algorithms. Some rules occur at the intersection of fuzzy sets' boundaries where there is less support (lower membership), so the rules are lost. A genetic algorithm (GA)-based solution is described that uses the flexible nature of the 2-tuple linguistic representation to discover rules that occur at the intersection of fuzzy set boundaries. The GA-based approach is enhanced from previous work by including a graph representation and an improved fitness function. A comparison of the GA-based approach with a traditional approach on real-world Web log data discovered rules that were lost with the traditional approach. The GA-based approach is recommended as complementary to existing algorithms, because it discovers extra rules. (C) 2013 Elsevier B.V. All rights reserved.
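
The boundary effect mentioned in this abstract can be shown with a minimal sketch. It assumes triangular membership functions and models the 2-tuple idea as a lateral shift of a fuzzy set by a displacement `alpha` times the spacing between labels; the function names, the label layout, and the numbers are illustrative assumptions, not the paper's configuration.

```python
def triangular(x, left, peak, right):
    """Membership of x in a triangular fuzzy set defined by (left, peak, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def shifted(x, left, peak, right, alpha, step):
    """Same fuzzy set moved sideways by alpha * step (alpha is the lateral displacement)."""
    d = alpha * step
    return triangular(x, left + d, peak + d, right + d)

# Assumed labels for a quantity such as seconds spent on a page,
# with peaks spaced 10 apart: "low" peaks at 0, "medium" at 10.
x = 5  # value sitting exactly where "low" and "medium" intersect
low = triangular(x, -10, 0, 10)
medium = triangular(x, 0, 10, 20)

# Shifting "medium" by half a step centres it on the boundary value.
medium_shifted = shifted(x, 0, 10, 20, alpha=-0.5, step=10)
```

With static sets the value contributes only half membership to either label, which lowers the fuzzy support of any rule built on it. A search procedure that is allowed to tune the displacement can recover full membership for such values.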