423 research outputs found

    Explainable subgraphs with surprising densities : a subgroup discovery approach

    The connectivity structure of graphs is typically related to the attributes of the nodes. In social networks, for example, the probability of a friendship between any pair of people depends on a range of attributes, such as their age, residence location, workplace, and hobbies. The high-level structure of a graph can thus possibly be described well by means of patterns of the form `the subgroup of all individuals with certain properties X are often (or rarely) friends with individuals in another subgroup defined by properties Y', in comparison to what is expected. Such rules present potentially actionable and generalizable insight into the graph. We present a method that finds node subgroup pairs between which the edge density is interestingly high or low, using an information-theoretic definition of interestingness. Additionally, the interestingness is quantified subjectively, to contrast with prior information an analyst may have about the connectivity. This view immediately enables iterative mining of such patterns. This is the first method aimed at graph connectivity relations between different subgroups. Our method generalizes prior work on dense subgraphs induced by a subgroup description. Although this setting has been studied already, we demonstrate for this special case considerable practical advantages of our subjective interestingness measure with respect to a wide range of (objective) interestingness measures.
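    As a rough illustration of the quantity this abstract refers to, the sketch below computes the edge density between two attribute-defined subgroups and compares it with the density of the whole graph. The toy graph, the `hobby` attribute, and the function name `edge_density` are all assumptions made for the example. Using the overall density as the "expected" value is a deliberate simplification and is not the information-theoretic, subjective interestingness measure the paper describes.

```python
from itertools import product


def edge_density(edges, group_x, group_y):
    """Fraction of distinct unordered X-Y node pairs joined by an edge."""
    edge_set = {frozenset(e) for e in edges}
    pairs = {frozenset((u, v)) for u, v in product(group_x, group_y) if u != v}
    if not pairs:
        return 0.0
    return sum(1 for p in pairs if p in edge_set) / len(pairs)


# Hypothetical toy data: four people, one attribute each.
attributes = {
    "ann": {"hobby": "chess"},
    "bob": {"hobby": "chess"},
    "cat": {"hobby": "golf"},
    "dan": {"hobby": "golf"},
}
edges = [("ann", "cat"), ("ann", "dan"), ("bob", "cat"), ("ann", "bob")]

# Subgroups are defined by a description over node attributes.
chess = {n for n, a in attributes.items() if a["hobby"] == "chess"}
golf = {n for n, a in attributes.items() if a["hobby"] == "golf"}
all_nodes = set(attributes)

between = edge_density(edges, chess, golf)           # density between subgroups
overall = edge_density(edges, all_nodes, all_nodes)  # naive baseline expectation

# A positive gap means the pair is denser than the naive baseline suggests.
gap = between - overall
```

    A subjective measure would replace `overall` with the density expected under the analyst's prior beliefs and update that expectation after each pattern is shown.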

    Mining and modeling graphs using patterns and priors

    The Minimum Description Length Principle for Pattern Mining: A Survey

    This is about the Minimum Description Length (MDL) principle applied to pattern mining. The length of this description is kept to the minimum. Mining patterns is a core task in data analysis and, beyond issues of efficient enumeration, the selection of patterns constitutes a major challenge. The MDL principle, a model selection method grounded in information theory, has been applied to pattern mining with the aim of obtaining compact, high-quality sets of patterns. After giving an outline of relevant concepts from information theory and coding, as well as of work on the theory behind the MDL and similar principles, we review MDL-based methods for mining various types of data and patterns. Finally, we open a discussion on some issues regarding these methods, and highlight currently active related data analysis problems.
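    The model selection idea behind MDL can be sketched with a two-part code: the total description length is the cost of the model plus the cost of the data given the model, and the model with the shorter total is preferred. The example below is an assumption-laden toy and not a method from the survey. It encodes a binary sequence either with a fixed fair-coin model or with a fitted Bernoulli model whose single parameter is charged (1/2) log2 n bits, a common approximation.

```python
import math


def data_bits(bits, p):
    """Code length in bits of the sequence under a Bernoulli(p) model."""
    total = 0.0
    for b in bits:
        q = p if b == 1 else 1.0 - p
        total += -math.log2(q)
    return total


def two_part_length(bits, use_fitted):
    """Total description length: model cost plus data cost given the model."""
    n = len(bits)
    if not use_fitted:
        # Fair coin: no parameter to transmit, one bit per symbol.
        return data_bits(bits, 0.5)
    p = sum(bits) / n
    p = min(max(p, 1e-9), 1.0 - 1e-9)  # keep log2 finite for constant sequences
    model_bits = 0.5 * math.log2(n)    # assumed cost of one real parameter
    return model_bits + data_bits(bits, p)


skewed = [1] * 28 + [0] * 4   # strong regularity: the fitted model pays off
balanced = [1, 0] * 16        # no skew: the extra parameter is wasted

skewed_prefers_fitted = two_part_length(skewed, True) < two_part_length(skewed, False)
balanced_prefers_fitted = two_part_length(balanced, True) < two_part_length(balanced, False)
```

    In pattern mining the same trade-off applies with a set of patterns playing the role of the model: a pattern is kept only if the bits it saves when encoding the data exceed the bits needed to describe it.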

    Using Compression to Find Interesting One Dimensional Cellular Automata

    Learning subjectively interesting data representations

    GENERIC FRAMEWORKS FOR INTERACTIVE PERSONALIZED INTERESTING PATTERN DISCOVERY

    Traditional frequent pattern mining algorithms generate an exponentially large number of patterns, a substantial portion of which are not very significant for many data analysis endeavours. Due to this, the discovery of a small number of interesting patterns from the exponentially large number of frequent patterns according to a particular user's interest is an important task. Existing works on patter

    Analytic and constructive processes in the comprehension of text

    This thesis explores the process of comprehension as a purposeful interaction between a reader and the information in a text. The review begins by discussing the difference between educational and psychological perspectives on comprehension. Approaches to the analysis of text structure are then described and models and theories of the representation of knowledge are evaluated. It is argued that these are limited in that they tend to focus either on the text or the reader: they either examine those procedures that are necessary for text analysis or the knowledge structures required for comprehension, storage and retrieval. Those that come nearest to examining the interaction between text and knowledge structures tend to be limited in terms of the texts they can deal with and they do not deal adequately with the predictive aspects of comprehension. Experiments are reported which look at the ongoing predictions made by readers, and how these are affected by factors such as text structure and ‘interestingness’. The experiments provided the opportunity for examining the potential of alternative methodologies (such as the content analysis of open-ended questions). It is felt that it is necessary to examine comprehension using methods which are direct but not intrusive. The studies reported demonstrate that it is possible to obtain reliable measures of a reader's predictions and that these are systematically affected by the structure and content of the text.

    Online summarization of dynamic graphs using subjective interestingness for sequential data

    Algorithms and the Foundations of Software technolog