4 research outputs found

    Report of the First International Workshop on Learning over Multiple Contexts (LMCE 2014)

    © ACM 2015. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ACM SIGKDD Explorations Newsletter, http://dx.doi.org/10.1145/2830544.2830551

    The first international workshop on Learning over Multiple Contexts, devoted to generalization and reuse of machine learning models over multiple contexts, was held on September 19th, 2014, as part of the 7th European machine learning and data mining conference (ECML-PKDD 2014) in Nancy, France. This short report summarizes the presentations and discussions held during the LMCE 2014 workshop, as well as the workshop conclusions and the future agenda.

    Ferri Ramírez, C.; Flach, P.; Lachiche, N. (2015). Report of the First International Workshop on Learning over Multiple Contexts (LMCE 2014). ACM SIGKDD Explorations Newsletter. 17(1):48-50. doi:10.1145/2830544.2830551

    Exceptional spatio-temporal behavior mining through Bayesian non-parametric modeling

    Collective social media provides a vast number of geo-tagged social posts, which contain various records of spatio-temporal behavior. Modeling spatio-temporal behavior on collective social media is an important task for applications like tourism recommendation, location prediction and urban planning. Properly accomplishing this task requires a model that allows for diverse behavioral patterns in each of three aspects: spatial location, time, and text. In this paper, we address the following question: how can we find representative subgroups of social posts for which the spatio-temporal behavioral patterns are substantially different from the behavioral patterns in the whole dataset? Selection and evaluation are the two challenging problems in finding the exceptional subgroups. To address these problems, we propose BNPM, a Bayesian non-parametric model, to model spatio-temporal behavior and infer the exceptionality of social posts in subgroups. By training BNPM on a large number of randomly sampled subgroups, we can obtain the global distribution of behavioral patterns. For each given subgroup of social posts, its posterior distribution can be inferred by BNPM. By comparing the posterior distribution with the global distribution, we can quantify the exceptionality of each given subgroup. The exceptionality scores are used to guide the search process within the exceptional model mining framework to automatically discover the exceptional subgroups. Various experiments are conducted to evaluate the effectiveness and efficiency of our method. On four real-world datasets, our method discovers subgroups coinciding with events, subgroups distinguishing professionals from tourists, and subgroups whose consistent exceptionality can only be truly appreciated by combining exceptional spatio-temporal and exceptional textual behavior.
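
The comparison step described in this abstract, scoring a subgroup by how far its distribution lies from the global one, can be illustrated with a minimal sketch. It assumes, purely for illustration, that both the global and the subgroup behavior are summarized as discrete histograms and that the distance is the Kullback-Leibler divergence. The abstract does not say which comparison BNPM uses, and the function names and counts below are hypothetical.

```python
import math


def normalize(counts):
    """Turn raw counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length.

    eps guards against division by zero; it is an illustrative choice.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def exceptionality(subgroup_counts, global_counts):
    """Assumed score: divergence of the subgroup from the global distribution."""
    return kl_divergence(normalize(subgroup_counts), normalize(global_counts))


# Hypothetical histograms over three behavior bins (e.g. three locations).
global_counts = [50, 30, 20]
typical_subgroup = [5, 3, 2]   # same proportions as the global histogram
unusual_subgroup = [1, 1, 8]   # concentrated in the third bin

typical_score = exceptionality(typical_subgroup, global_counts)
unusual_score = exceptionality(unusual_subgroup, global_counts)
```

Under these assumptions, a subgroup that mirrors the global proportions scores close to zero, while a subgroup concentrated in one bin scores higher, which is the kind of ranking a search procedure could use to prefer exceptional subgroups.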

    Anytime Discovery of a Diverse Set of Patterns with Monte Carlo Tree Search

    The discovery of patterns that accurately discriminate one class label from another remains a challenging data mining task. Subgroup discovery (SD) is one of the frameworks that make it possible to elicit such interesting patterns from labeled data. One question remains fairly open: how to select an accurate heuristic search technique when exhaustive enumeration of the pattern space is infeasible? Existing approaches make use of beam search, sampling, and genetic algorithms for discovering a pattern set that is non-redundant and of high quality w.r.t. a pattern quality measure. We argue that such approaches produce pattern sets that lack diversity: only a few patterns that are of high quality and different enough are discovered. Our main contribution is then to formally define pattern mining as a game and to solve it with Monte Carlo tree search (MCTS). It can be seen as an exhaustive search guided by random simulations which can be stopped early (limited budget) by virtue of its best-first search property. We show through a comprehensive set of experiments how MCTS enables the anytime discovery of a diverse pattern set of high quality. It outperforms other approaches when dealing with a large pattern search space and for different quality measures. Thanks to its genericity, our MCTS approach can be used for SD but also for many other pattern mining tasks.

    ROCsearch: an ROC-guided search strategy for subgroup discovery

    Subgroup Discovery (SD) aims to find coherent, easy-to-interpret subsets of the dataset at hand, where something exceptional is going on. Since the resulting subgroups are defined in terms of conditions on attributes of the dataset, this data mining task is ideally suited to be used by non-expert analysts. The typical SD approach uses a heuristic beam search, involving parameters that strongly influence the outcome. Unfortunately, these parameters are often hard to set properly for someone who is not a data mining expert; correct settings depend on properties of the dataset, and on the resulting search landscape. To remove this potential obstacle for casual SD users, we introduce ROCsearch [1], a new ROC-based beam search variant for Subgroup Discovery. On each search level of the beam search, ROCsearch analyzes the intermediate results in ROC space to automatically determine a sensible search width for the next search level. Thus, beam search parameter setting is taken out of the domain expert's hands, lowering the threshold for using Subgroup Discovery. Also, ROCsearch automatically adapts its search behavior to the properties and resulting search landscape of the dataset at hand. Aside from these advantages, we also show that ROCsearch is an order of magnitude more efficient than traditional beam search, while its results are equivalent to, and on large datasets even better than, those of traditional beam search.
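
For context, the conventional fixed-width beam search that this abstract contrasts ROCsearch with can be sketched as follows. This is not ROCsearch itself: the beam width stays constant at every level, the quality measure (weighted relative accuracy, WRAcc) is an assumption, and the toy data and function names are hypothetical.

```python
def wracc(rows, labels, description):
    """Weighted relative accuracy of the subgroup selected by `description`.

    `description` is a tuple of (attribute, value) equality conditions.
    """
    covered = [
        y for r, y in zip(rows, labels)
        if all(r[a] == v for a, v in description)
    ]
    if not covered:
        return 0.0
    n, n_sub = len(labels), len(covered)
    return (n_sub / n) * (sum(covered) / n_sub - sum(labels) / n)


def beam_search(rows, labels, beam_width=2, max_depth=2):
    """Level-wise beam search with a fixed width (the user-set parameter)."""
    conditions = sorted({(a, v) for r in rows for a, v in r.items()})
    beam = [()]          # start from the empty description
    results = []
    for _ in range(max_depth):
        candidates = set()
        for desc in beam:
            used = {a for a, _ in desc}
            for cond in conditions:
                if cond[0] not in used:   # one condition per attribute
                    candidates.add(tuple(sorted(desc + (cond,))))
        # Sort candidates first so ties are broken deterministically.
        scored = sorted(
            ((wracc(rows, labels, d), d) for d in sorted(candidates)),
            key=lambda s: -s[0],
        )
        top = scored[:beam_width]
        results.extend(top)
        beam = [d for _, d in top]
    return sorted(results, key=lambda s: -s[0])


# Hypothetical toy dataset: label 1 marks the posts of interest.
rows = [
    {"city": "Nancy", "time": "night"},
    {"city": "Nancy", "time": "night"},
    {"city": "Nancy", "time": "day"},
    {"city": "Paris", "time": "night"},
    {"city": "Paris", "time": "day"},
    {"city": "Paris", "time": "day"},
]
labels = [1, 1, 0, 0, 0, 0]

ranked = beam_search(rows, labels, beam_width=2, max_depth=2)
best_score, best_description = ranked[0]
```

Here `beam_width` and `max_depth` stand in for the kind of parameters the abstract describes as hard to set; according to the abstract, ROCsearch instead derives the width for each level from the intermediate results in ROC space.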