23,430 research outputs found

    Mining Closed Itemsets for Coherent Rules: An Inference Analysis Approach

    Get PDF
    Past observations have shown that a frequent item set mining algorithm are alleged to mine the closed ones because the finish offers a compact and a whole progress set and higher potency. Anyhow, the most recent closed item set mining algorithms works with candidate maintenance combined with check paradigm that is dear in runtime likewise as area usage when support threshold is a smaller amount or the item sets gets long. Here, we show, PEPP with inference analysis that could be a capable approach used for mining closed sequences for coherent rules while not candidate. It implements a unique sequence closure checking format with inference analysis that based mostly on Sequence Graph protruding by an approach labeled Parallel Edge projection and pruning in brief will refer as PEPP. We describe a novel inference analysis approach to prune patterns that tends to derive coherent rules. A whole observation having sparse and dense real-life information sets proved that PEPP with inference analysis performs larger compared to older algorithms because it takes low memory and is quicker than any algorithms those cited in literature frequently

    Exploratory topic modeling with distributional semantics

    Full text link
    As we continue to collect and store textual data in a multitude of domains, we are regularly confronted with material whose largely unknown thematic structure we want to uncover. With unsupervised, exploratory analysis, no prior knowledge about the content is required and highly open-ended tasks can be supported. In the past few years, probabilistic topic modeling has emerged as a popular approach to this problem. Nevertheless, the representation of the latent topics as aggregations of semi-coherent terms limits their interpretability and level of detail. This paper presents an alternative approach to topic modeling that maps topics as a network for exploration, based on distributional semantics using learned word vectors. From the granular level of terms and their semantic similarity relations global topic structures emerge as clustered regions and gradients of concepts. Moreover, the paper discusses the visual interactive representation of the topic map, which plays an important role in supporting its exploration.Comment: Conference: The Fourteenth International Symposium on Intelligent Data Analysis (IDA 2015

    Mining Heterogeneous Multivariate Time-Series for Learning Meaningful Patterns: Application to Home Health Telecare

    Full text link
    For the last years, time-series mining has become a challenging issue for researchers. An important application lies in most monitoring purposes, which require analyzing large sets of time-series for learning usual patterns. Any deviation from this learned profile is then considered as an unexpected situation. Moreover, complex applications may involve the temporal study of several heterogeneous parameters. In that paper, we propose a method for mining heterogeneous multivariate time-series for learning meaningful patterns. The proposed approach allows for mixed time-series -- containing both pattern and non-pattern data -- such as for imprecise matches, outliers, stretching and global translating of patterns instances in time. We present the early results of our approach in the context of monitoring the health status of a person at home. The purpose is to build a behavioral profile of a person by analyzing the time variations of several quantitative or qualitative parameters recorded through a provision of sensors installed in the home

    Subjectively Interesting Subgroup Discovery on Real-valued Targets

    Get PDF
    Deriving insights from high-dimensional data is one of the core problems in data mining. The difficulty mainly stems from the fact that there are exponentially many variable combinations to potentially consider, and there are infinitely many if we consider weighted combinations, even for linear combinations. Hence, an obvious question is whether we can automate the search for interesting patterns and visualizations. In this paper, we consider the setting where a user wants to learn as efficiently as possible about real-valued attributes. For example, to understand the distribution of crime rates in different geographic areas in terms of other (numerical, ordinal and/or categorical) variables that describe the areas. We introduce a method to find subgroups in the data that are maximally informative (in the formal Information Theoretic sense) with respect to a single or set of real-valued target attributes. The subgroup descriptions are in terms of a succinct set of arbitrarily-typed other attributes. The approach is based on the Subjective Interestingness framework FORSIED to enable the use of prior knowledge when finding most informative non-redundant patterns, and hence the method also supports iterative data mining.Comment: 12 pages, 10 figures, 2 tables, conference submissio

    Intrinsically Dynamic Network Communities

    Get PDF
    Community finding algorithms for networks have recently been extended to dynamic data. Most of these recent methods aim at exhibiting community partitions from successive graph snapshots and thereafter connecting or smoothing these partitions using clever time-dependent features and sampling techniques. These approaches are nonetheless achieving longitudinal rather than dynamic community detection. We assume that communities are fundamentally defined by the repetition of interactions among a set of nodes over time. According to this definition, analyzing the data by considering successive snapshots induces a significant loss of information: we suggest that it blurs essentially dynamic phenomena - such as communities based on repeated inter-temporal interactions, nodes switching from a community to another across time, or the possibility that a community survives while its members are being integrally replaced over a longer time period. We propose a formalism which aims at tackling this issue in the context of time-directed datasets (such as citation networks), and present several illustrations on both empirical and synthetic dynamic networks. We eventually introduce intrinsically dynamic metrics to qualify temporal community structure and emphasize their possible role as an estimator of the quality of the community detection - taking into account the fact that various empirical contexts may call for distinct `community' definitions and detection criteria.Comment: 27 pages, 11 figure

    Integration of decision support systems to improve decision support performance

    Get PDF
    Decision support system (DSS) is a well-established research and development area. Traditional isolated, stand-alone DSS has been recently facing new challenges. In order to improve the performance of DSS to meet the challenges, research has been actively carried out to develop integrated decision support systems (IDSS). This paper reviews the current research efforts with regard to the development of IDSS. The focus of the paper is on the integration aspect for IDSS through multiple perspectives, and the technologies that support this integration. More than 100 papers and software systems are discussed. Current research efforts and the development status of IDSS are explained, compared and classified. In addition, future trends and challenges in integration are outlined. The paper concludes that by addressing integration, better support will be provided to decision makers, with the expectation of both better decisions and improved decision making processes
    corecore