2,665 research outputs found

Discovering Knowledge from Local Patterns with Global Constraints

Author: Soulet Arnaud
Publication venue: Dagstuhl Seminar Proceedings. 07181 - Parallel Universes and Local Patterns
Publication date: 01/01/2007

It is well known that local patterns are at the core of much of the knowledge that may be discovered from data. Nevertheless, the use of local patterns is limited by their huge number and computational costs. Several approaches (e.g., condensed representations, pattern set discovery) aim at grouping or synthesizing local patterns to provide a global view of the data. A global pattern is a pattern which is a set or a synthesis of local patterns coming from the data. In this paper, we propose the idea of global constraints to write queries addressing global patterns. A key point is the ability to bias the design of global patterns according to the expectations of the user. For instance, a global pattern can be oriented towards the search for exceptions or a clustering. This requires writing queries that take such biases into account. Open issues are to design a generic framework to express powerful global constraints and solvers to mine them. We think that global constraints are a promising way to discover relevant global patterns.

Dagstuhl Research Online Publication Server

Learning what matters - Sampling interesting patterns

Author: M Bhuiyan
M Boley
M Leeuwen van
S Chakraborty
S Shalev-Shwartz
T Calders
V Dzyuba
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

In the field of exploratory data mining, local structure in data can be described by patterns and discovered by mining algorithms. Although many solutions have been proposed to address the redundancy problems in pattern mining, most of them either provide succinct pattern sets or take the interests of the user into account, but not both. Consequently, the analyst has to invest substantial effort in identifying those patterns that are relevant to her specific interests and goals. To address this problem, we propose a novel approach that combines pattern sampling with interactive data mining. In particular, we introduce the LetSIP algorithm, which builds upon recent advances in 1) weighted sampling in SAT and 2) learning to rank in interactive pattern mining. Specifically, it exploits user feedback to directly learn the parameters of the sampling distribution that represents the user's interests. We compare the performance of the proposed algorithm to the state-of-the-art in interactive pattern mining by emulating the interests of a user. The resulting system allows efficient and interleaved learning and sampling, thus user-specific anytime data exploration. Finally, LetSIP demonstrates favourable trade-offs concerning both quality-diversity and exploitation-exploration when compared to existing methods. Comment: PAKDD 2017, extended version

arXiv.org e-Print Archive

Crossref

Leiden University Scholary Publications

Optimization of Association Rules Extraction Through Exploitation of Context Dependent Constraints

Author: Botta M.
Esposito R.
Gallo A.
Meo R.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005

Institutional Research Information System University of Turin

Knowledge data discovery and data mining in a design environment

Author: Duffy Alex
Haffey Mark
Publication date: 01/01/2000

Designers, in the process of satisfying design requirements, generally encounter difficulties in, firstly, understanding the problem and, secondly, finding a solution [Cross 1998]. Often the understanding of the problem and a feasible solution are developed simultaneously, by proposing a solution to gauge the extent to which it satisfies the specific requirements. Support for future design activities has long been recognised to exist in the form of past design cases; however, the varying degrees of similarity and dissimilarity found between previous and current design requirements and solutions have restrained the effectiveness of utilising past design solutions. The knowledge embedded within past designs provides a source of experience with the potential to be utilised in future developments, provided that the ability to structure and manipulate that knowledge can be made a reality. Providing the ability to manipulate past design knowledge allows the ranging viewpoints experienced by a designer during a design process to be reflected and supported. Data Mining systems are gaining acceptance in several domains but to date remain largely unrecognised in terms of their potential to support design activities. It is the focus of this paper to introduce the functionality possessed within the realm of Data Mining tools, and to evaluate the level of support that may be achieved in manipulating and utilising experiential knowledge to satisfy designers' ranging perspectives throughout a product's development.

University of Strathclyde Institutional Repository

Query Rewriting in Itemset Mining

Author: Botta Marco
Esposito Roberto
Meo Rosa
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2004

In recent years, researchers have begun to study inductive databases, a new generation of databases for leveraging decision support applications. In this context, the user interacts with the DBMS using advanced, constraint-based languages for data mining where constraints have been specifically introduced to increase the relevance of the results and, at the same time, to reduce their volume. In this paper we study the problem of mining frequent itemsets using an inductive database. We propose a technique for query answering which consists in rewriting the query in terms of union and intersection of the result sets of other queries, previously executed and materialized. Unfortunately, the exploitation of past queries is not always applicable. We then present sufficient conditions for the optimization to apply and show that these conditions are strictly connected with the presence of functional dependencies between the attributes involved in the queries. We show some experiments on an initial prototype of an optimizer which demonstrate that this approach to query answering is not only viable but in many practical cases absolutely necessary, since it drastically reduces the execution time.

CiteSeerX

Institutional Research Information System University of Turin

Mining Patterns in Networks using Homomorphism

Author: Dries Anton
Nijssen Siegfried
Publication date: 14/10/2011

In recent years many algorithms have been developed for finding patterns in graphs and networks. A disadvantage of these algorithms is that they use subgraph isomorphism to determine the support of a graph pattern; subgraph isomorphism is a well-known NP-complete problem. In this paper, we propose an alternative approach which mines tree patterns in networks by using subgraph homomorphism. The advantage of homomorphism is that it can be computed in polynomial time, which allows us to develop an algorithm that mines tree patterns in arbitrary graphs in incremental polynomial time. Homomorphism, however, entails two problems not found when using isomorphism: (1) two patterns of different size can be equivalent; (2) patterns of unbounded size can be frequent. In this paper we formalize these problems and study solutions that easily fit within our algorithm.
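
The polynomial-time claim for tree patterns can be illustrated with a small sketch. This is not the authors' mining algorithm; it is a generic memoized check, assuming a rooted, node-labelled tree pattern and an undirected, node-labelled graph stored as adjacency lists. The function name `tree_homomorphism_exists` and all data layouts are hypothetical choices made for this example.

```python
from functools import lru_cache


def tree_homomorphism_exists(pattern_children, pattern_labels, root,
                             graph_adj, graph_labels):
    """Check whether a rooted tree pattern maps homomorphically into a graph.

    A homomorphism sends every pattern node to a graph node with the same
    label and every pattern edge to a graph edge. The mapping need not be
    injective, which is what separates it from subgraph isomorphism.
    """

    @lru_cache(maxsize=None)
    def can_map(p, g):
        # Pattern node p may be placed on graph node g only if labels agree.
        if pattern_labels[p] != graph_labels[g]:
            return False
        # Each child of p must be placeable on some neighbour of g.
        # Children are checked independently, so two children may share
        # the same image in the graph.
        return all(
            any(can_map(c, h) for h in graph_adj[g])
            for c in pattern_children.get(p, ())
        )

    return any(can_map(root, g) for g in graph_adj)


# Hypothetical toy data: a graph with a single edge A - B.
graph_adj = {0: [1], 1: [0]}
graph_labels = {0: "A", 1: "B"}

# Pattern: root A with two B children. It has more nodes than the graph,
# yet both children can be mapped to the same graph node.
pattern_children = {"r": ["x", "y"]}
pattern_labels = {"r": "A", "x": "B", "y": "B"}

star_matches = tree_homomorphism_exists(
    pattern_children, pattern_labels, "r", graph_adj, graph_labels)

# Pattern: path A - B - A, which folds back and forth over the single edge.
path_matches = tree_homomorphism_exists(
    {"r": ["x"], "x": ["y"]}, {"r": "A", "x": "B", "y": "A"}, "r",
    graph_adj, graph_labels)
```

Each (pattern node, graph node) pair is evaluated at most once, so the work is bounded by a polynomial in the sizes of the pattern and the graph. Both toy patterns match the single-edge graph, which hints at the two problems the abstract mentions: patterns of different size can behave identically, and arbitrarily long patterns can still occur.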

arXiv.org e-Print Archive

Lirias

CiteSeerX

07181 Abstracts Collection -- Parallel Universes and Local Patterns

Author: Berthold Michael R.
Morik Katharina
Siebes Arno
Publication venue: Dagstuhl Seminar Proceedings. 07181 - Parallel Universes and Local Patterns
Publication date: 01/01/2007

From 1 May 2007 to 4 May 2007 the Dagstuhl Seminar 07181 "Parallel Universes and Local Patterns" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

Dagstuhl Research Online Publication Server

Probabilistic Inductive Querying Using ProbLog

Author: De Raedt Luc
Gutmann Bernd
Kersting Kristian
Kimmig Angelika
Santos Costa Vitor
Toivonen Hannu
Publication venue: Springer
Publication date: 01/01/2009

We study how probabilistic reasoning and inductive querying can be combined within ProbLog, a recent probabilistic extension of Prolog. ProbLog can be regarded as a database system that supports both probabilistic and inductive reasoning through a variety of querying mechanisms. After a short introduction to ProbLog, we provide a survey of the different types of inductive queries that ProbLog supports, and show how it can be applied to the mining of large biological networks. Peer reviewed

CiteSeerX

Crossref

Fraunhofer-ePrints

Helsingin yliopiston digitaalinen arkisto

DIAL UCLouvain

Hal-Diderot